RabbitMQ clustering with Consul in Nomad

28 Jan 2019

RabbitMQ is at the centre of a lot of microservice architectures, and while you can cluster it manually, it is a lot easier to use one of the auto-clustering plugins, such as AWS (EC2), Consul, Etcd, or Kubernetes. As I like to use Nomad for container orchestration, I thought it would be a good idea to show how to cluster RabbitMQ when it is running in a Docker container, on an unknown host (i.e. one picked by Nomad).

I ran into a few problems trying to get this working, but a lot of searching and some help from the RabbitMQ mailing list (thanks Luke!) got me through all the issues, so hopefully, this will be easier next time and for other people too.

It is also worth noting that this is only going to be covering how to make a cluster work, not how to make it secure (setting up TLS etc.) for production usage. There is a lot of documentation on the RabbitMQ website for further reading on this!

The full repository with all of the demo code is available on my Github.

Nomad Cluster

As this post is mostly about running RabbitMQ on Nomad, and not setting up Nomad, I’ll give the basics here - the full scripts are in the repository, and linked below too.

Vagrant is used to create a three-node cluster. As I use Hyper-V for VMs, I can’t set static IPs from the Vagrantfile, so I have used another Vagrant feature: triggers.

Triggers let us specify scripts to run after Vagrant actions, so in this case, we run a script after the first machine (n1) comes up which writes out its IP to the /vagrant share. The other machines can then read this same file to join the cluster:

Vagrant.configure(2) do |config|
  config.vm.box = "bento/ubuntu-16.04"

  config.vm.provision "shell", path: "./provision.sh", privileged: false

  config.vm.define "n1" do |n1|
    n1.vm.provision "shell", path: "./server.sh", privileged: false

    n1.trigger.after :up do |trigger|
      trigger.run_remote = { inline: "ip route get 1 | awk '{print $NF;exit}' > /vagrant/server_ip" }
    end
  end

  config.vm.define "n2" do |n2|
    n2.vm.hostname = "nomad2"
    n2.vm.provision "shell", path: "./client.sh", privileged: false
  end
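
  # n3 is defined in the same way as n2 (see the full Vagrantfile in the repository)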
end

The provision.sh script downloads and installs both Consul and Nomad, and then the respective server.sh and client.sh scripts set up both services in the right ways. The server machine also acts as a Nomad client, so that we don’t need 4 VMs running.

Vagrantfile, provision.sh, server.sh, client.sh

All that remains to be done is vagrant up…and wait. But while we’re waiting for the machines to provision, let’s have a look at RabbitMQ clustering, and how we’ll define the job in Nomad.

RabbitMQ Cluster

A few things to note about clustering RabbitMQ:

  • All nodes must be listening on the same port for clustering (4369 by default)
  • The ERL_EPMD_PORT variable doesn’t work on rabbitmq < 3.7.9
  • The latest Docker image for rabbitmq is 3.7.8
  • The rabbit node names must be DNS resolvable
  • The RABBITMQ_ERLANG_COOKIE must have the same value

The rabbitmq_peer_discovery_consul plugin we will use is shipped with RabbitMQ by default but is disabled. The easiest way to get everything up and running is to create your own docker container, with the plugin enabled, and a small configuration file to set a few options:

FROM rabbitmq:management-alpine
COPY rabbitmq.conf /etc/rabbitmq
RUN rabbitmq-plugins enable --offline rabbitmq_peer_discovery_consul

The rabbitmq.conf only needs a few lines. For this demo, we are enabling the guest account and allowing it access to the web UI from anywhere, which should not be done in production!

loopback_users.guest = false
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul
cluster_formation.consul.svc_addr_auto = true

With both files in place, we can build the image:

docker build -t rabbitmq:consul .

An image built from this is also available on Docker Hub.
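If you would rather not build the image yourself, pulling the published image and retagging it to match the name used in the commands below should work too:

docker pull pondidum/rabbitmq:consul
docker tag pondidum/rabbitmq:consul rabbitmq:consul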

Once we have a custom container built, it’s a good idea to test that it actually works, before we start trying to get Nomad to run it. We’ll do this by creating a network in Docker so that all the containers can talk directly to each other on their pre-defined ports. Don’t forget to change CONSUL_HOST to your machine’s IP address!

docker network create rabbit
docker run -d --rm --name consul -p 8500:8500 consul

docker run -d --rm --name rabbit1 -h rabbit1 --network rabbit -p 30001:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
docker run -d --rm --name rabbit2 -h rabbit2 --network rabbit -p 30002:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
docker run -d --rm --name rabbit3 -h rabbit3 --network rabbit -p 30003:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
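Before checking the management UI, we can ask any of the nodes directly whether the cluster has formed, using the rabbitmqctl binary that ships in the image:

docker exec rabbit1 rabbitmqctl cluster_status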

You can now visit http://localhost:30001 (or 30002 or 30003) and see that we have a successful cluster running. Once you’re happy with it, you can kill it all off (as we started the containers with the --rm flag, Docker will delete them for us when they stop):

docker stop rabbit1 rabbit2 rabbit3 consul
docker network rm rabbit

Nomad Rabbit Job

Now that we know our container clusters successfully, we can create a job definition to do the same thing in Nomad. Nomad jobs are defined in HCL, a JSON-like configuration language.

A job requires a name, the datacenters it can run in, and a job type. In this case, our job is called rabbit (imaginative, I know), we’ll run it in dc1 (the default value Nomad starts with), and we’ll make this job a service, as opposed to a batch or system job:

job "rabbit" {

  datacenters = ["dc1"]
  type = "service"

  group "cluster" {
    # tasks ...
  }
}

The group is used to hold a collection of tasks, and when allocating a job, Nomad will make sure that all tasks belonging to an instance of a group are on the same node.

So for example, if you had a 2 node Nomad cluster, and 3 instances of a group which contained 3 tasks (e.g. API, varnish, and nginx containers), Nomad might distribute the groups like so:

image of several nodes with groups of containers

Within the group, we can specify the number of instances we want with the count property, and we also specify that for both updates and migrations, only one group can be changed at a time. This means that if you decide to upgrade the container used by the job, Nomad won’t stop all instances at once, destroying your service’s availability!

We also specify that migrations should use the service health checks (defined later on), rather than the state of the task itself, to determine health; how long a task must keep passing its checks before it is considered healthy; and the deadline by which it must become healthy.

group "cluster" {
  count = 3

  update {
    max_parallel = 1
  }

  migrate {
    max_parallel = 1
    health_check = "checks"
    min_healthy_time = "5s"
    healthy_deadline = "30s"
  }
}

The task is our unit of work in Nomad. In this case, we are using the docker driver, but Nomad also supports many other drivers, including exec, rkt, and lxc. We configure which image to use, and, importantly, set the container’s hostname to the Nomad host’s name!

The port_map tells Nomad which ports of the container we want to expose and gives them labels. We can then refer to the ports by their labels in other parts of the configuration.

task "rabbit" {
  driver = "docker"

  config {
    image = "pondidum/rabbitmq:consul"
    hostname = "${attr.unique.hostname}"
    port_map {
      amqp = 5672
      ui = 15672
      epmd = 4369
      clustering = 25672
    }
  }
}

The env section is pretty self-explanatory: it defines environment variables to pass to the container. As Consul is running on the Nomad host, we use the Nomad interpolation attribute to specify the IP of the current host, and we also set the RABBITMQ_ERLANG_COOKIE to a specific value. In a production environment, you should set this value to something unguessable, possibly using the Vault integration in Nomad to fetch a token. We can also add other settings to pass to the container here, such as RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS. As with the cookie, in a production-like environment you’d probably want to use the Vault integration to pull the values for these variables.

env {
  RABBITMQ_ERLANG_COOKIE = "rabbitmq"
  CONSUL_HOST = "${attr.unique.network.ip-address}"
}

The resources section lets us set constraints on things like CPU, memory, IOPS, and network. In our case, we are only specifying a set of ports to expose on the network, and that we want them to be bound to specific ports on the host:

resources {
  network {
    port "amqp" { static = 5672 }
    port "ui" { static = 15672 }
    port "epmd" { static = 4369 }
    port "clustering" { static = 25672 }
  }
}

We could select different ports to bind the container ports to, or leave out the static pair entirely to have Nomad map the ports to random unused ports on the host.

Finally, the service block integrates with service discovery (so, Consul), and allows us to register ports and health checks for our service. In the case of our RabbitMQ cluster, we already have service discovery integration via the RabbitMQ Consul plugin, so this registration is only used for the check feature, which is what will also be used by the migrate block to see if a task is healthy:

service {
  check {
    name     = "alive"
    type     = "tcp"
    port     = "ui"
    interval = "10s"
    timeout  = "2s"
  }
}

The check is using the ui port defined earlier to check if the UI is alive. We could also change the health check to use the amqp port instead, as that might be a better indication that the actual service can do useful things. We can define multiple checks, and are not limited to TCP; grpc, http, and script are also supported.

The entire job definition is below, and is also available in the repository.

job "rabbit" {

  datacenters = ["dc1"]
  type = "service"

  group "cluster" {
    count = 3

    update {
      max_parallel = 1
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "5s"
      healthy_deadline = "30s"
    }

    task "rabbit" {
      driver = "docker"

      config {
        image = "pondidum/rabbitmq:consul"
        hostname = "${attr.unique.hostname}"
        port_map {
          amqp = 5672
          ui = 15672
          epmd = 4369
          clustering = 25672
        }
      }

      env {
        RABBITMQ_ERLANG_COOKIE = "rabbitmq"
        CONSUL_HOST = "${attr.unique.network.ip-address}"
      }

      resources {
        network {
          port "amqp" { static = 5672 }
          port "ui" { static = 15672 }
          port "epmd" { static = 4369 }
          port "clustering" { static = 25672 }
        }
      }

      service {
        name = "rabbitmq"
        port = "ui"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }

    }
  }
}

Running The Job

First, make sure your console can talk to Nomad, which we can do by using the server_ip file again:

export NOMAD_ADDR="http://$(cat server_ip):4646"
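A quick way to verify the connection (and that all three clients have registered) is to list the client nodes:

nomad node status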

Now it should be possible to run the job:

nomad job run rabbit/rabbit.nomad
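While the allocations are being placed, we can keep an eye on progress from the command line:

nomad job status rabbit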

After a few moments, we can visit any of the Nomad hosts and log in to the RabbitMQ UI (http://SOME_SERVER_IP:15672), where we should be greeted with a running cluster of three nodes:

rabbitmq cluster dashboard
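We can also confirm that the service and its alive check were registered by querying the Consul HTTP API on any of the hosts (a quick sketch, assuming Consul's default port of 8500):

curl "http://$(cat server_ip):8500/v1/health/service/rabbitmq?passing"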

Homework

  • Kill a container on one of the nodes and see what happens (vagrant ssh n2 then docker stop <SOME_CONTAINER_ID>)
  • Create an Application which you deploy to Nomad which uses service discovery to talk to RabbitMQ
  • Create some more Nomad clients, and try making a bigger RabbitMQ cluster

If you have any questions, feel free to comment below or send me a tweet.

infrastructure, vagrant, nomad, consul, rabbitmq

---

Testing Immutable Infrastructure

01 Jan 2019

In my previous post, I glossed over one of the most important and useful parts of Immutable Infrastructure: Testability. There are many kinds of tests we can write for our infrastructure, but they should all be focused on the machine/service and maybe its nearest dependencies, not the entire system.

While this post focuses on testing a full machine (both locally in a VM, and remotely as an Amazon EC2 instance), it is also possible to do most of the same kind of tests against a Docker container. In fact, one of the tools used in this post supports building Docker containers as an output in parallel to the AMIs, so this can also assist in providing a migration path to/from Docker.

As an example, I will show how I built and tested a LogStash machine, including how to verify that the script to create the production machine is valid, that the machine itself has been provisioned correctly, and that the services inside work as expected.

I have published all the source code to GitHub. The examples in this post are all taken from the repository but might have a few bits removed just for readability. Check the full source out if you are interested!

Repository Structure and Tools

When it comes to building anything that you will have lots of, consistency is key to making it manageable. To that end, I have a small selection of tools that I use, and a repository structure I try and stick to. They are the following:

Vagrant - This is a tool for building and managing virtual machines. It can be backed by many different providers such as Docker, Hyper-V and VirtualBox. We’ll use this to build a local Linux machine to develop and test LogStash in. I use the Hyper-V provider, as that is what Docker for Windows also uses, and Hyper-V disables other virtualisation tools.

Packer - This tool provides a way to build machine images. Where Vagrant builds running machines, Packer builds the base images for you to boot, and can build multiple different ones (in parallel) from one configuration. We’ll use this to create our AMIs (Amazon Machine Images.)

Jest - This is a testing framework written in (and for) NodeJS. Whatever testing tool works best for your environment is what you should be using, but I use Jest as it introduces minimal dependencies, is cross-platform, and has some useful libraries for doing things like diffing json.

The repository structure is pretty simple:

  • scripts/
  • src/
  • test/
  • build.sh
  • logstash.json
  • package.json
  • vagrantfile

The src directory is where our application code will live. If the application is compiled, the output goes to the build directory (which is not tracked in source-control.) The test directory will contain all of our tests, and the scripts directory will contain everything needed for provisioning our machines.

We’ll describe what the use of each of these files is as we go through the next section.

Local Development

To create our virtual machine locally, we will use Vagrant. To tell Vagrant how to build our machine, we need to create a vagrantfile in our repository, which will contain the machine details and provisioning steps.

The machine itself has a name, CPU count, and memory specified. There is also a setting for Hyper-V which allows us to use a differencing disk, which reduces both the startup time for the VM and the amount of disk space it uses on the host machine.

For provisioning, we specify to run the relevant two files from the scripts directory.

Vagrant.configure("2") do |config|
    config.vm.box = "bento/ubuntu-16.04"

    config.vm.provider "hyperv" do |hv|
        hv.vmname = "LogStash"
        hv.cpus = 1
        hv.memory = 2048
        hv.linked_clone = true
    end

    config.vm.provision "shell", path: "./scripts/provision.sh"
    config.vm.provision "shell", path: "./scripts/vagrant.sh"
end

To keep things as similar as possible between our development machine and our output AMI, I keep as much of the setup as possible in one script: scripts/provision.sh. In the case of our LogStash setup, this means installing Java, LogStash, some LogStash plugins, and enabling the service on reboots:

#! /bin/bash

# add elastic's package repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update

# install openjdk and set environment variable
sudo apt-get install openjdk-8-jre -y
JAVA=$(readlink -f $(which java) | sed "s:bin/java::")
echo "JAVA_HOME=$JAVA" | sudo tee --append /etc/environment

#install logstash and plugins
sudo apt-get install logstash -y
/usr/share/logstash/bin/logstash-plugin install logstash-filter-uuid
/usr/share/logstash/bin/logstash-plugin install logstash-filter-prune

sudo systemctl enable logstash.service

Vagrant will automatically mount its working directory into the VM under the path /vagrant. This means we can add a second provisioning script (scripts/vagrant.sh) to link the /vagrant/src directory to the LogStash configuration directory (/etc/logstash/conf.d), meaning we can edit the files on the host machine, and then restart LogStash to pick up the changes.

#! /bin/bash
sudo rm -rf /etc/logstash/conf.d
sudo ln -s /vagrant/src /etc/logstash/conf.d

sudo systemctl start logstash.service

Now that we have a vagrantfile, we can start the virtual machine with a single command. Note, Hyper-V requires administrator privileges, so you need to run this command in an admin terminal:

vagrant up

After a while, your new LogStash machine will be up and running. If you want to log into the machine and check files and processes etc., you can run the following command:

vagrant ssh

An argument can also be provided to the ssh command to be executed inside the VM, which is how I usually trigger LogStash restarts (as it doesn’t seem to detect when I save the config files in the src directory):

vagrant ssh -c 'sudo systemctl restart logstash'

Deployment

To create the deployable machine image, I use Packer. The process is very similar to how Vagrant is used: select a base AMI, create a new EC2 machine, provision it, and save the result as a new AMI.

Packer is configured with a single json file, in this case, named logstash.json. The file is split into four parts: variables, builders, provisioners, and post-processors. I won’t include the post-processors section as it’s not needed when building AMIs.

Variables

The variables property is for all configuration that you can pass to Packer. Their values can come from Environment Variables, CLI parameters, Consul, Vault, and others. In the LogStash example, there are three variables:

{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": "",
    "ami_users": "{{env `AMI_ACCOUNTS`}}"
  }
}

The aws_access_key and aws_secret_key are known names - unless we specify some value, they will automatically be read from your AWS config (in ~/.aws/), or if running on EC2, from the EC2 machine profile.

The ami_users is a custom variable which will read the AMI_ACCOUNTS environment variable by default. This particular one is used so that I can grant access to the resulting AMI to multiple AWS accounts, which is useful if you’re running in an Organisation with multiple Accounts. For example, if the AMI is built in a common account, and will be deployed into dev, qa and prod accounts, then you would populate the AMI_ACCOUNTS as a CSV of account IDs.
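For example, building an AMI shared with two other accounts might look like this (the account IDs here are made up):

AMI_ACCOUNTS="111111111111,222222222222" packer build logstash.json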

Builders

Packer can build many different kinds of machine image, but for this, we only need one: amazon-ebs.

{
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "eu-west-1",
      "instance_type": "t2.micro",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "ubuntu/images/*ubuntu-xenial-16.04-amd64-server-*",
          "root-device-type": "ebs"
        },
        "owners": ["099720109477"],
        "most_recent": true
      },
      "ssh_username": "ubuntu",
      "ami_name": "logstash {{timestamp}}",
      "ami_users": "{{user `ami_users`}}"
    }
  ]
}

The two most interesting properties of this are source_ami_filter and ami_users. The source_ami_filter works in a very similar manner to the AWS CLI’s describe-images --filters parameter, albeit in a more readable format. In this case, I am specifying that I want an ubuntu-xenial base, and I want it to be an official Canonical image, so I specify their account ID as the owner. I also specify the most_recent property, as this filter will return all versions of this AMI which Canonical publish.

The ami_users is what lets me grant access to the AMI from other accounts (rather than just making it public). The property’s value should be an array, but Packer is smart enough to expand the CSV in the user variable into an array for us.

Provisioners

The provisioners array items are executed in the order they are specified. To set up the machine, I use the shell provisioner to create a temporary directory, then the file provisioner to upload the files in the src directory to that temporary directory. Finally a second shell provisioner uploads and runs the scripts/provision.sh and scripts/aws.sh files.

{
  "provisioners": [
    {
      "type": "shell",
      "inline": "mkdir -p /tmp/src"
    },
    {
      "type": "file",
      "source": "./src/",
      "destination": "/tmp/src"
    },
    {
      "type": "shell",
      "scripts": ["./scripts/provision.sh", "./scripts/aws.sh"]
    }
  ]
}

The aws.sh file is very small and does roughly the same thing as the vagrant.sh script, but rather than symlinking the /vagrant directory, it moves the uploaded src directory into the right location for LogStash:

#! /bin/sh

sudo rm /etc/logstash/conf.d/*
sudo cp -r /tmp/src/* /etc/logstash/conf.d

Note that this doesn’t start the LogStash service - this gets done by the UserData when we launch a new instance, as often we need to pass in additional configuration parameters, and don’t want the service running until that has been done.

Running

To create the AMI, we need to invoke packer. If I am running packer on a remote machine via SSH, I run it inside tmux, so that disconnects don’t fail the process:

packer build -var "ami_users=111,222,333" logstash.json

After a while, Packer will finish, leaving you with an output which will include the new AMI ID:

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:

eu-west-1: ami-123123123

We’ll get back to this output later when we create a build script that will also run our tests. Before we get to that, however, let’s look at how we can write tests which target both the local Vagrant machine and the AMI too.

Testing

To test the machines, I am using Jest. There isn’t anything particularly interesting going on in the package.json, other than a few babel packages being installed so that I can use ES6 syntax:

{
  "scripts": {
    "watch": "jest --watch",
    "test": "jest "
  },
  "devDependencies": {
    "babel-core": "^6.26.3",
    "babel-jest": "^23.6.0",
    "babel-preset-env": "^1.7.0",
    "jest": "^23.6.0",
    "regenerator-runtime": "^0.13.1"
  }
}

Packer Configuration Testing

There are a number of tests we can do to make sure our Packer configuration is valid before running it. This includes things like checking the base AMI is from a whitelisted source (such as our accounts, Amazon and Canonical). The test has to handle the possibility of multiple builders, and that some builders might not have a source_ami_filter. It also handles if no owner has been specified at all, which we also consider a “bad thing”:

// the packer template under test (adjust the path to wherever logstash.json lives)
const packer = require("../logstash.json");
// a friendly name for reporting: the builder's name if set, otherwise its type
const builderName = builder => builder.name || builder.type;

const ourAccounts = [ "111111", "222222", "333333", "444444" ];
const otherOwners = [ "amazon", "099720109477" /*canonical*/ ];

describe("ami builder", () => {

  it("should be based on a whitelisted owner", () => {
    const allOwners = ourAccounts.concat(otherOwners);
    const invalidOwners = owners => owners.filter(owner => !allOwners.includes(owner));

    const amisWithInvalidOwners = packer.builders
      .filter(builder => builder.source_ami_filter)
      .map(builder => ({
        name: builderName(builder),
        invalidOwners: invalidOwners(builder.source_ami_filter.owners || [ "NO OWNER SPECIFIED" ])
      }))
      .filter(builders => builders.invalidOwners.length > 0);

    expect(amisWithInvalidOwners).toEqual([]);
  });

});

I also test that certain variables (ami_users) have been defined, and have been used in the right place:

describe("variables", () => {
  it("should have a variable for who can use the ami", () => {
    expect(packer.variables).toHaveProperty("ami_users");
  });

  it("should read ami_users from AMI_ACCOUNTS", () => {
    expect(packer.variables.ami_users).toMatch(
      /{{\s*env\s*`AMI_ACCOUNTS`\s*}}/
    );
  });
});

describe("ami builder", () => {
  it("should set the ami_user", () => {

    const invalidUsers = packer.builders
      .map(builder => ({
        name: builderName(builder),
        users: builder.ami_users || "NO USERS SPECIFIED"
      }))
      .filter(ami => !ami.users.match(/{{\s*user\s*`ami_users`\s*}}/));

    expect(invalidUsers).toEqual([]);
  });
})

Other tests you might want to add are that the base AMI is under a certain age, or that your AMI has certain tags included, or that it is named in a specific manner.
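Packer can also catch template syntax errors itself before a build starts, so it is worth wiring its built-in validation into the build script as a cheap first check:

packer validate logstash.json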

Machine Testing

Machine testing is for checking that our provisioning worked successfully. This is very useful, as subtle bugs can creep in when you don’t verify what happens.

For example, a machine I built copied a configuration directory to a target location but was missing the -r flag, so when I later added a subdirectory, the machine failed as the referenced files didn’t exist.

So that the tests work with both the Vagrant and Packer built versions, we take the machine’s address, SSH username, and key path from the environment:

import { spawnSync } from "child_process";
import { createConnection } from "net";

// the machine address, ssh username, and key path come from the environment
const host = process.env.LOGSTASH_ADDRESS; // e.g. "172.27.48.28";
const keyPath = process.env.LOGSTASH_KEYPATH; // ".vagrant/machines/default/hyperv/private_key";
const username = process.env.LOGSTASH_SSH; // "vagrant" locally, "ubuntu" for the EC2 instance

We also define two helper methods: one to check if a TCP port is open, and one which uses SSH to execute a command and read the response in the machine:

const execute = command => {
  const args = [`${username}@${host}`, `-i`, keyPath, command];
  const ssh = spawnSync("ssh", args, { encoding: "utf8" });
  const lines = ssh.stdout.split("\n");

  if (lines[lines.length - 1] === "") {
    return lines.slice(0, lines.length - 1);
  }
  return lines;
};

const testPort = port => new Promise((resolve, reject) => {
  const client = createConnection({ host: host, port: port });

  client.on("error", err => reject(err));
  client.on("connect", () => {
    client.end();
    resolve();
  });
});

We can then add some tests which check the files were written to the right place, that port 5044 is open, and port 9600 is closed:

describe("the machine", () => {

  it("should have the correct configuration", () => {
    const files = execute("find /etc/logstash/conf.d/* -type f");

    expect(files).toEqual([
      "/etc/logstash/conf.d/beats.conf",
      "/etc/logstash/conf.d/patterns/custom.txt"
    ]);
  });

  it("should be listening on 5044 for beats", () => testPort(5044));
  it("should not be listening on 9600", () => expect(testPort(9600)).rejects.toThrow("ECONNREFUSED"));
});

Of course, as we can execute any command inside the machine, we can check pretty much anything:

  • tail the LogStash log and see if it’s got the right contents
  • check if the service is started
  • check the service is enabled on boot
  • check the environment variables have been written to the right files
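For example, the commands passed to the execute helper for the service checks could be as simple as this sketch:

systemctl is-active logstash    # prints "active" once the service is running
systemctl is-enabled logstash   # prints "enabled" if it will start on boot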

Application Testing

There are two styles of Application Testing: white-box and black-box. White-box will be tests run on the application inside the machine, using minimal external dependencies (preferably none at all), and Black-box will be run on the application from outside the machine, either using direct dependencies, or fakes.

It’s worth noting that both white-box and black-box tests are slow, mostly down to how slow LogStash is at starting up, although only giving it 1 CPU and 2Gb of RAM probably doesn’t help.

Whitebox Testing LogStash

To white-box test LogStash, I use a technique partially based on the Agolo LogStash Test Runner. The process for the tests is to run LogStash interactively (rather than as a service), send it a single event, record the output events, and compare them to an expected output.

The test cases are kept in separate folders, with two files. First is the input file, imaginatively called input.log, which will contain one json encoded event per line. The format needs to match what the result of FileBeat sending an event to LogStash would be. In this case, it means a few extra fields, and a message property containing a string of json. Formatted for readability, the object looks like this:

{
  "@timestamp": "2018-12-27T14:08:24.753Z",
  "beat": { "hostname": "Spectre", "name": "Spectre", "version": "5.3.0" },
  "fields": { "environment": "local", "log_type": "application" },
  "input_type": "log",
  "message": "{\"Timestamp\": \"2018-12-18T17:06:27.7112297+02:00\",\"Level\": \"Information\",\"MessageTemplate\": \"This is the {count} message\",\"Properties\": {\"count\": 4,\"SourceContext\": \"LogLines.GetOpenPurchasesHandler\",\"ApplicationName\": \"FileBeatTest\",\"CorrelationId\": \"8f341e8e-6b9c-4ebf-816d-d89c014bad90\",\"TimedOperationElapsedInMs\": 1000}}",
  "offset": 318,
  "source": "D:\\tmp\\logs\\single.log",
  "type": "applicationlog"
}

I also define an output.log, which contains the expected result(s), again one json encoded event per line. The example pipeline in the repository will emit two events for a given input, so this file contains two lines of json (again, newlines added for readability here):

{
  "source": "D:\\tmp\\logs\\single.log",
  "@version": "1",
  "fields": { "log_type": "application", "environment": "local" },
  "@timestamp": "2018-12-18T15:06:27.711Z",
  "offset": 318,
  "ApplicationName": "FileBeatTest",
  "host": "ubuntu-16",
  "type": "applicationlog",
  "CorrelationId": "8f341e8e-6b9c-4ebf-816d-d89c014bad90",
  "MessageTemplate": "This is the {count} message",
  "Level": "Information",
  "Context": "LogLines.GetOpenPurchasesHandler",
  "TimeElapsed": 1000,
  "Properties": { "count": 4 }
}
{
  "duration": 1000000,
  "timestamp": 1545145586711000,
  "id": "<generated>",
  "traceid": "8f341e8e6b9c4ebf816dd89c014bad90",
  "name": "LogLines.GetOpenPurchasesHandler",
  "localEndpoint": { "serviceName": "FileBeatTest" }
}

To enable sending the lines directly to LogStash (rather than needing to use FileBeat), we define an input.conf file, which configures LogStash to read json from stdin:

input {
  stdin { codec => "json_lines" }
}

And an output.conf file which configures LogStash to write the output as json lines to a known file path:

output {
  file {
    path => "/tmp/test/output.log"
    codec => "json_lines"
  }
}

The tests need to be run inside the machine itself, so I created a script in the ./scripts directory which will do all the work, and can be run by the execute method in a Jest test. The script stops the LogStash service, copies the current configuration from the ./src directory and the replacement input.conf and output.conf files to a temporary location, and then runs LogStash once per test case, copying the result file to the test case’s directory.

#! /bin/bash

sudo systemctl stop logstash

temp_path="/tmp/test"
test_source="/vagrant/test/acceptance"

sudo rm -rf "$temp_path"/*
sudo mkdir -p $temp_path
sudo cp -r /vagrant/src/* $temp_path
sudo cp $test_source/*.conf $temp_path

find $test_source/* -type d | while read test_path; do
    echo "Running $(basename $test_path) tests..."

    sudo /usr/share/logstash/bin/logstash \
        "--path.settings" "/etc/logstash" \
        "--path.config" "$temp_path" \
        < "$test_path/input.log"

    sudo touch "$temp_path/output.log"   # create it if it doesn't exist (dropped logs etc.)
    sudo rm -f "$test_path/result.log"
    sudo mv "$temp_path/output.log" "$test_path/result.log"

    echo "$(basename $test_path) tests done"
done

sudo systemctl start logstash

To execute this, we use the beforeAll function to run it once - we also pass in Number.MAX_SAFE_INTEGER as by default beforeAll will time out after 5 seconds, and the test.sh is slow as hell (as LogStash takes ages to start up).

Once the test.sh script has finished running, we load each test’s output.log and result.log files, parse each line as json, compare the objects, and print out the delta if the objects are not considered equal:

import fs from "fs";
import path from "path";
// DiffPatcher and formatters are assumed to come from the jsondiffpatch package
import { DiffPatcher, formatters } from "jsondiffpatch";
// execute is the same ssh helper used in the machine tests above

// parse a file containing one json object per line into an array of objects
const readFile = filePath =>
  fs.readFileSync(filePath, "utf8")
    .split("\n")
    .filter(line => line.length > 0)
    .map(line => JSON.parse(line));

const source = "./test/acceptance";
const isDirectory = p => fs.lstatSync(p).isDirectory();

const cases = fs
  .readdirSync(source)
  .map(name => path.join(source, name))
  .filter(isDirectory);

describe("logstash", () => {
  beforeAll(
    () => execute("/vagrant/scripts/test.sh"),
    Number.MAX_SAFE_INTEGER);

  test.each(cases)("%s", directoryPath => {
    const expected = readFile(path.join(directoryPath, "output.log"));
    const actual = readFile(path.join(directoryPath, "result.log"));

    const diffpatch = new DiffPatcher({
      propertyFilter: (name, context) => {
        if (name !== "id") {
          return true;
        }

        return context.left.id !== "<generated>";
      }
    });

    const delta = diffpatch.diff(expected, actual);
    const output = formatters.console.format(delta);

    if (output.length) {
      console.log(output);
    }

    expect(output.length).toBe(0);
  });
});

Blackbox Testing LogStash

As the machine has ports open for FileBeat and will send its output to ElasticSearch, we can set up a fake HTTP server, send some log events via FileBeat to the VM and check we receive the right HTTP calls to our fake server.

While looking on how to do this, I came across the lumberjack-protocol package on NPM, but unfortunately, it only supports lumberjack v1, and FileBeat and LogStash are now using v2, so you would have to use a local copy of filebeat to do the sending.

Due to the complexity of implementing this, and the diminished return on investment (the other tests should be sufficient), I have skipped creating the Blackbox tests for the time being.

AMI Testing

The final phase! Now that we are reasonably sure everything works locally, we need to build our AMI and test that everything works there too, as it would be a shame to update an Auto Scale Group with the new image which doesn’t work!

All that needs to happen to run the tests against an EC2 instance is to set the three environment variables we used with Vagrant, to values for communicating with the EC2 instance. To do this, we’ll need the EC2 IP Address, the username for SSH, and the private key for SSH authentication.

The first thing our build script needs to do is create the AMI. This is done in the same way as mentioned earlier, but with the slight difference of also piping the output to tee:

packer_log=$(packer build logstash.json | tee /dev/tty)
ami_id=$(echo "$packer_log" | tail -n 1 | sed 's/.*\(ami.*\)/\1/')

By using tee, we can pipe the build log from Packer to both the real terminal (/dev/tty), and to a variable called packer_log. The script then takes the last line and uses some regex to grab the AMI ID.

Next up, the script uses the AWS CLI to launch an EC2 instance based on the AMI, and store its IP address and instance ID:

json=$(aws ec2 run-instances \
  --image-id "$ami_id" \
  --instance-type t2.small \
  --key-name "$keypair_name" \
  --region eu-west-1 \
  --subnet-id "$subnet_id" \
  --security-group-ids "$security_group_id" \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=logstash-verification}]' \
  --user-data "$userdata")

instance_id=$(echo "$json" | jq -r .Instances[0].InstanceId)
private_ip=$(echo "$json" | jq -r .Instances[0].PrivateIpAddress)

The IP Address is then used to set up the environment variables which the node test scripts use to locate the machine:

LOGSTASH_ADDRESS="$private_ip" \
LOGSTASH_SSH="ubuntu" \
LOGSTASH_KEYPATH="~/.ssh/id_rsa" \
npm run test

Finally, the script uses the Instance ID to terminate the instance:

aws ec2 terminate-instances \
  --instance-ids "$instance_id"

Wrapping Up

Hopefully, this (rather long) post is a useful introduction (!) to how I tackle testing Immutable Infrastructure. All of these techniques for testing the machine and application can be used for testing things like Docker containers too (and handily, Packer can be used to create Docker containers also).

As mentioned earlier, the repository is available here.

logstash, microservices, infrastructure, vagrant, packer, aws, testing

---

Code-free tracing with LogStash and Jaeger

22 Dec 2018

I wanted to show request charts (similar to the network tab in firefox) for requests across our microservices but wanted to do so in the least invasive way possible.

Firefox network timeline

We already use LogStash to collect logs from multiple hosts (via FileBeat) and forward them on to ElasticSearch, so perhaps I can do something to also output from LogStash to a tracing service.

app to filebeat to logstash to elasticsearch and opentracing

There are a number of tracing services available (AppDash, Jaeger, Zipkin), but unfortunately LogStash doesn’t have plugins for any of them or for OpenTracing. I picked Jaeger, as it seemed to be the easiest to get up and running, and also had a “Zipkin compatible” API too if you didn’t want to use the Jaeger one.

Setting up and running Jaeger in a production environment is way out of scope for this post, but for testing, I am just using the all-in-one Docker container:

docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one

The tracing APIs work in roughly the same way: all spans within a trace share the same traceid, and the trace is sub-divided into spans. Spans have a unique id and can have a parent (span) id set. The trouble with sending data from LogStash is that I don’t have any parent IDs available. Well, at least not without going and adding one to all our applications, and I’m not sure how I would want to propagate it correctly either.

So the first question is: can I push multiple spans without any parent IDs, and have Jaeger plot them as I would expect?

Is span.parentid required?

As the HTTP API documentation seems to be non-existent (other than “there is one!”), and most of the other docs mention the Thrift API, I decided to use that to test the theory. The Jaeger C# client library has a nice API to use, but I wanted to manually specify things (e.g. traceid, spanid, etc.) and the API has no way to do that, as pretty much everything is marked as internal.

Instead, I cloned the client repository, and added a test (which had access to all the internals I wanted):

public class Scratch
{
    private readonly HttpSender _sender;
    private readonly Tracer _tracer;
    private readonly TraceId _id;

    public Scratch(ITestOutputHelper output)
    {
        _sender = new HttpSender("http://localhost:14268/api/traces");
        _tracer = new Tracer.Builder("FirstService").Build();
        _id = TraceId.NewUniqueId(false);

        output.WriteLine(_id.ToString());
    }

    [Fact]
    public async Task Send()
    {
        var start = DateTime.UtcNow;

        await Create(start, start.AddSeconds(5));
        await Create(start, start.AddSeconds(3));

        await Create(start.AddSeconds(5), start.AddSeconds(10));
        await Create(start.AddSeconds(10), start.AddSeconds(20));
        await Create(start, start.AddSeconds(20));

        await _sender.FlushAsync(CancellationToken.None);
    }

    private async Task Create(DateTime start, DateTimeOffset finish)
    {
        var context = new SpanContext(_id, SpanId.NewUniqueId(), new SpanId(0), SpanContextFlags.None);
        var span = new Span(_tracer, "test", context, start, new Dictionary<string, object>(), null);
        span.Finish(finish);

        await _sender.AppendAsync(span, CancellationToken.None);
    }
}

The graph rendered how I hoped, proving that I don’t need to set the span’s parentid for everything with the same traceid to work correctly!

How to generate a consistent TraceID?

All our services pass along or generate a correlationID for each request, so my idea here was to do some conversion to put it in the traceid, which is an int64. Our correlationIDs are guids, which are internally int128s, so I figured I would take the least significant 8 bytes and use that. This can be done in C# like so:

var correlationID = Guid.NewGuid();
var traceID = new TraceId(BitConverter.ToInt64(correlationID.ToByteArray(), 8));

I noticed, however, that the TraceID structure in the Jaeger client can take two values: int64 low and int64 high, so I can use the entire correlationid.

How to ship from LogStash?

After reading through the LogStash Output Plugins docs, it looked like there were three possible ways to ship the events to Jaeger: writing a command line app invoked through either the exec or pipe outputs, writing a LogStash plugin, or doing something with the http output.

CLI adaptor

The idea of this method would be to write a CLI app which takes an event from LogStash and then sends it to Jaeger using the Thrift API. The app itself could be invoked using the exec or pipe outputs. I didn’t want to use this method for a few reasons:

Development: The Jaeger client has things I need hidden from the public surface, such as being able to specify the traceid, rather than generating a new one each time, so I would either have to reimplement the classes I need, use reflection to instantiate things, or try opening Pull Requests to expose more internals of the library.

Maintenance: I’d rather not have yet another thing to maintain if I can avoid it, not to mention all the additional things like repository location, build agents, etc.

Deployment: Our LogStash box is a plain Ubuntu box with nothing other than Java and LogStash installed on it. As I would probably end up using C# to write the app, I’d need to add even more dependencies.

Throughput: Jaeger (and the others) support sending traces in batches for better efficiency. Writing an app which supported that would involve writing the batching and caching, and the likelihood is I would mess it up, and increase the chances of lost messages.

LogStash plugin

This basically has the same problems as above, with the addition of me needing to learn enough about Java to build something reliable; as this is a test, and no one else in my team really knows Java either, I ruled this option out.

Http?

The Jaeger collector process can also expose a Zipkin compatible API, and while the documentation for that API is also terrible, I was able to find a Pull Request on Github with some structure defined. I figured I could do most of this directly with LogStash’s configuration DSL, assuming I could generate a uuid (logstash-filter-uuid plugin) and remove all the extra stuff from the post body (logstash-filter-prune).
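Before building the LogStash side, it is worth sanity-checking the endpoint by hand. The following is a minimal sketch using curl against the all-in-one container started earlier; the span reuses the field names and values from the example later in this post:

curl -X POST "http://localhost:9411/api/v2/spans" \
  -H "Content-Type: application/json" \
  -d '[{
        "traceid": "1ee98d32185e41faba146c151ce8e27d",
        "id": "8377a6dae87947d7",
        "name": "SomeOperation",
        "timestamp": 1545321238784000,
        "duration": 409000,
        "localEndpoint": { "serviceName": "PersonInformationService" }
      }]'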

Building the LogStash message

The first step of my configuration is based on what we need to do to Serilog messages to make them a bit nicer in Kibana:

Pre-processing

json {
    source => "message"
}

date {
    match => [ "Timestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSZZ"]
    target => "@timestamp"
}

mutate {
    rename => {
        "[Properties][ApplicationName]" => "ApplicationName"
        "[Properties][CorrelationId]" => "CorrelationId"
        "[Properties][SourceContext]" => "Context"
        "[Properties][TimedOperationElapsedInMs]" => "TimeElapsed"
    }
}

The json block tells LogStash that everything in the message property is a json object, rather than a string.

The date block extracts the time the message was recorded by the application and writes it to the special @timestamp field. This is a good idea as due to queuing, and processing times, the time the message was written and the time it was processed by LogStash will be different.

Lastly, the mutate block is moving a bunch of fields out of the message properties, and up to the top level of the object. We query by things like ApplicationName a lot, and it’s nice not to have to write Properties.ApplicationName:blahblah all the time.

Splitting

Our messages contain a lot of data which for the purposes of tracing, we just don’t care about. Not only is it a waste of bandwidth to send all the data to Jaeger, but also there could be clashing fields, and if we ever decide to use a hosted version of Jaeger, I don’t want to be sending them all the data either.

So how do we remove everything from the message, but still keep it in the message for the other outputs?

LogStash has an interesting filter called clone. What it does is copy the message as many times as you specify, and then runs the remaining part of the pipeline once per message, including the original message. In code, this just looks like this:

clone {
    clones => [ "jaeger" ]
}

If we wanted to split the message more times, we can just add more types to the clones array; there will be one extra message produced for each entry in the array, which will also be added to the message as the type property so we can filter on it later.

We don’t want to send the type field itself to Jaeger (or ElasticSeach), so we move it from the message to the @metadata object, which doesn’t get passed to outputs.

mutate {
    rename => { "type" => "[@metadata][type]" }
}

To handle the messages differently, we can just use an if statement on the @metadata.type property:

if [@metadata][type] == "jaeger" {
    # do stuff to the jaeger message
} else {
    # process the original message
}

Building the Http Payload

The hardest part of getting this to work was handling the dates. Jaeger specifies that timestamps should come in microseconds since epoch, and duration should be in microseconds also. We need to use the ruby filter so we can run some code to do this, rather than using an inbuilt filter primitive.

The duration is easy enough, as we report TimeElapsed in milliseconds:

ruby {
    code => "event.set('duration', (event.get('TimeElapsed') * 1000).floor)"
}

To handle the timestamp we need to convert it to microseconds since the epoch. We can get the fractional seconds since the epoch by calling to_f on the @timestamp field, and then multiply by 1000 a couple of times to express it as microseconds. We also floor it to remove the leftover fraction:

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor)"
}

This on its own will work, but it won’t be right. We report log entries when an operation completes, which means our timestamp is the finish of the event, not the start. To solve this, we subtract the duration off the timestamp:

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor - event.get('duration'))"
}

Next, we need to create our span id. We need to generate a random id, consisting of letters and numbers, so the easiest way is to generate a uuid, and then tweak it to fit the Jaeger/Zipkin format. To do that we need to install the logstash-filter-uuid plugin:

/usr/share/logstash/bin/logstash-plugin install logstash-filter-uuid

There are a few things we need to do to generate a valid span id (for Jaeger):

  1. generate a uuid
  2. remove the dashes
  3. make it lowercase
  4. trim it to 16 chars

We use four filters to achieve this:

uuid {
    target => "id"
}
mutate {
    gsub => [ "id", "-", "" ]
    lowercase => [ "id" ]
}
mutate {
    gsub => [ "id", "(^.{1,16}).*$", "\1"]
}

The reason we specify the mutate twice is that order of the same operation (gsub in this case) is not guaranteed within a mutate block.

Next up, we need to tweak the traceid, converting it from a guid into the same dash-free, lowercase format:

mutate {
    gsub => [ "traceid", "-", "" ]
    lowercase => [ "traceid" ]
}

Next, we can populate some other context into the message, by renaming the other fields we want. I discovered most of the fields and their formats from a combination of the thrift dtos and this pull request.

mutate {
    rename => {
        "CorrelationId" => "traceid"
        "Context" => "name"
        "ApplicationName" => "[localEndpoint][serviceName]"
    }
}

Finally, we want to trim all the other properties from our message, which we can do by using the logstash-filter-prune plugin:

prune {
    whitelist_names => [ "id", "traceid", "name", "timestamp", "duration", "localEndpoint" ]
}
mutate {
    remove_field => [ "@timestamp" ]
}

We have to explicitly remove the @timestamp property as the prune plugin seems to ignore it (it also ignores the @metadata property).

The complete filter code can be made a little more efficient by combining several of the mutate blocks, provided we keep things in the right order, as follows:

ruby {
    code => "event.set('duration', (event.get('TimeElapsed') * 1000).floor)"
}

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor - event.get('duration'))"
}

uuid {
    target => "id"
}

mutate {
    rename => {
        "CorrelationId" => "traceid"
        "Context" => "name"
        "ApplicationName" => "[localEndpoint][serviceName]"
    }
    remove_field => [ "@timestamp" ]
}

mutate {
    gsub => [ "id", "-", "" ]
    gsub => [ "traceid", "-", "" ]
    lowercase => [ "traceid", "id" ]
}

mutate {
    gsub => [ "id", "(^.{1,16}).*$", "\1"]
}

prune {
    whitelist_names => [ "id", "traceid", "name", "timestamp", "duration", "localEndpoint" ]
}

Publishing to Jaeger

When publishing to the Jaeger API, you have to send it an array of json objects. To achieve this in the http output plugin, we can enable the format => json_batch, which will cause the HTTP client to buffer some events and then send them all at once. So more efficiency, and no extra code on our part.

To help make testing easier, I publish to two different HTTP endpoints: the Jaeger docker container running on my host machine, and a Request Logger Bin, where I can inspect the request if something is reported as a failure by Jaeger, replay it (via the Insomnia REST client), and check the error messages returned from Jaeger.

output {
    if [@metadata][type] == "jaeger" {
        http {
            url => "http://192.168.0.115:9411/api/v2/spans"
            format => "json_batch"
            http_method => "post"
        }

        http {
            url => "http://requestloggerbin.herokuapp.com/bin/ff3da5b2-ec57-4611-ada8-b360ab38830c"
            format => "json_batch"
            http_method => "post"
        }
    } else {
        # normal message handling: elasticsearch etc.
        stdout { codec => rubydebug }
    }
}

I also kept an eye on LogStash’s own log output by tailing the log file:

tail -f /var/log/logstash/logstash-plain.log

Eventually, I had a working LogStash configuration and can view application traces in Jaeger.

Problems

One thing I cannot figure out how to do is sending the tags property to Jaeger. The Zipkin (and OpenTracing) APIs specify tags to be an object, e.g.

{
  "traceid": "1ee98d32185e41faba146c151ce8e27d",
  "id": "8377a6dae87947d7",
  "timestamp": 1545321238784000,
  "duration": 409000,
  "localEndpoint": { "serviceName": "PersonInformationService" },
  "name": "SomeOperation",
  "tags": {
    "CorrelationID": "1ee98d32-185e-41fa-ba14-6c151ce8e27d",
    "db.name": "test-blahblah"
  }
}

But LogStash wants the tags property to be an array of strings, and even using the remove_field on tags and then manually populating it doesn’t work. Currently, I don’t need anything in the tags, but it would be nice to have it working. I should ask on the Elastic forums perhaps.

Wrapping Up

This post turned out to be a lot longer than I was expecting, but hopefully, it was interesting. Next time I’ll try and go through how I do testing of immutable infrastructure, as well as local development of the base images.

jaeger, logstash, microservices, infrastructure

---

Against SemVer

16 Dec 2018

Well, for Applications & Services at least. For libraries, SemVer is the way to go, assuming you can agree on what a breaking change is defined as.

But when it comes to Applications (or SaaS products, websites, etc.) SemVer starts to break down. The problem starts with the most obvious: What is a breaking change? How about a minor change?

What’s in a change?

For example, if we were to change the UI of a web application, which caused no backend changes, from the user’s perspective it is probably a breaking change, but not from the developer’s perspective. What about changing a backend process, for example, the way the service is billed? How about adding a new step in the middle of an existing process?

These are all hard questions to answer, and I imagine there are many many edge cases and things which are not clear as to what level of change they are.

Clash of the Versions

The next problem stems from the fact that we don’t (often) do Trunk Based Development for applications. We have a long-lived (1-2 weeks) feature branch, which might get pushed to a test environment multiple times as tasks are completed. If we SemVer these deployments, when a bug fix happens on the master branch, we can end up with a version clash, and can’t deploy.

branching, showing a clash of SemVer by having concurrent branches

While this is a problem, we can solve it easily - we can use the -pre or -beta suffix for the feature branch, and then remove the suffix and increment the version number when the feature is fully deployed. This, however, is adding a little more complexity to the process - mostly on the human side of things this time.

I would rather avoid the complexity (machine and human) entirely so instead opt for a different solution: Dates.

How about a Date?

Date stamps to the rescue! Our builds now use the following format:

<year>.<month>.<day>.<build_number>

We did consider using the time (represented as .hhmm) instead of the build_number, but two builds triggered in the same minute would have clashed, whereas the build_number is guaranteed to be unique. By using an automatic format, we gain a few advantages:

  • No confusion on whether a change is a major, minor, or patch
  • No clashes on multiple branches
  • No human input required (marking things as pre or beta)
  • Answers the “when was this built” question
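Producing the version in a build script is a one-liner; this is a sketch, assuming the CI server exposes a BUILD_NUMBER environment variable:

version="$(date -u +%Y.%m.%d).${BUILD_NUMBER}"
echo "$version"    # e.g. 2018.12.16.42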

What about the libraries?

The libraries have a very different set of requirements, and one versioning scheme doesn’t seem to fit both very well, so there is no point trying to force the matter.

code, ci, cd

---

Stopping Caring...

08 Dec 2018

…about GitHub open source commit streak.

This is, I think, partially triggered by Marc Gravell’s post. I currently have a GitHub commit streak going on 1878 days. The other night I realised that I don’t care about it any more, and more so, I’m not sure why I did to start with.

I didn’t even mean to start doing it. I just noticed one day that I had done something every day for a couple of weeks, and vaguely wondered how long I could keep that up for. It turns out the answer is “quite a while”. For the most part, it was easy to do - just a small commit a day, it only takes a few minutes right? But more and more often, I noticed that it was having a negative effect on what I was writing.

For instance, I wanted to learn about Nomad, but had been putting it off, as it wouldn’t involve making any public commits. So each time I came to start, I couldn’t until I had done a public commit, and once that commit was done, so was my motivation. Which meant learning Nomad didn’t happen for a long time.

It also affected what I did with old repositories on GitHub - I wanted to remove some which serve no useful purpose any more, but I couldn’t do that either, as that would remove their commits from the streak. Likewise, I have a new version of my TwelveFactor repository, but I really should remove the old version. But doing so would remove the commits from the streak too. So, as you might have guessed, I haven’t done that either. Maybe soon. I have a lot of writing to go with it too.

On the subject of writing, that also suffered. It takes a lot of effort to sit down and start writing a post, but usually, the words flow pretty smoothly once I am going. But knowing I need to spare some energy for the commit of the day makes it even harder to start writing, so I just haven’t been. There are a lot of drafts with bullet-point lists waiting though, so maybe they will begin to see the light of day soon.

The irony is not lost on me that making this post will increase the commit count (GitHub pages based blog), so I am not committing (or finishing) this post until at least a day has passed.

I decided on the 7th December 2018 not to make any commits. After that had happened - the act of deciding, mind you, not the day itself - I felt a lot better. I do not need to do anything. I can go and finish the book a friend recommended me. I can spend my evenings learning about things I want to. Or just do nothing some evenings.

No more worrying about public commits. It’s just not worth it.

code, books, health

---