Running a Secure RabbitMQ Cluster in Nomad

06 Apr 2019

Last time I wrote about running a RabbitMQ cluster in Nomad, one of the main pieces of feedback I received was about the (lack of) security of the setup, so I decided to revisit this and write about how to launch a secure RabbitMQ node in Nomad.

The things I want to cover are:

  • Username and Password for the management UI
  • Secure value for the Erlang Cookie
  • SSL for Management and AMQP

As usual, the demo repository with all the code is available if you’d rather just jump into that.

Configure Nomad To Integrate With Vault

To manage the certificates and credentials I will use another Hashicorp tool called Vault, which provides Secrets As A Service. It can be configured for High Availability, but for the demo, we will just use a single instance on one of our Nomad machines.

Vault

We’ll update the Vagrant script used in the last post about Nomad Rabbitmq Clustering to add in a single Vault node. This is not suitable for using Vault in production; for that there should be a separate Vault cluster running somewhere, but as this post is focusing on how to integrate with Vault, a single node will suffice.

Once we have Vault installed (see the provision.sh script), we need to set up a few parts. First is a PKI (public key infrastructure), better known as a Certificate Authority (CA). We will generate a single root certificate and have our client machines (and optionally the host machine) trust that one certificate.

As the machines are running in Hyper-V with the Default Switch, we can use the inbuilt domain name, mshome.net, and provide our own certificates. This script is run as part of the Server (nomad1) provisioning script, but in a production environment would be outside of this scope.

domain="mshome.net"
vault secrets enable pki
vault secrets tune -max-lease-ttl=87600h pki

vault write -field=certificate pki/root/generate/internal common_name="$domain" ttl=87600h \
    > /vagrant/vault/mshome.crt

vault write pki/config/urls \
    issuing_certificates="$VAULT_ADDR/v1/pki/ca" \
    crl_distribution_points="$VAULT_ADDR/v1/pki/crl"

vault write pki/roles/rabbit \
    allowed_domains="$domain" \
    allow_subdomains=true \
    generate_lease=true \
    max_ttl="720h"

sudo cp /vagrant/vault/mshome.crt /usr/local/share/ca-certificates/mshome.crt
sudo update-ca-certificates

If you don’t want scary screens in Firefox and Chrome, you’ll need to install the mshome.crt certificate into your trust store.

Next up, we have some policies we need in Vault. The first deals with what the Nomad Server(s) are allowed to do - namely to handle tokens for itself, and anything in the nomad-cluster role. A full commented version of this policy is available here.

path "auth/token/create/nomad-cluster" {
  capabilities = ["update"]
}

path "auth/token/roles/nomad-cluster" {
  capabilities = ["read"]
}

path "auth/token/lookup-self" {
  capabilities = ["read"]
}

path "auth/token/lookup" {
  capabilities = ["update"]
}

path "auth/token/revoke-accessor" {
  capabilities = ["update"]
}

path "sys/capabilities-self" {
  capabilities = ["update"]
}

path "auth/token/renew-self" {
  capabilities = ["update"]
}

As this policy mentions the nomad-cluster role a few times, let’s have a look at that also:

{
  "disallowed_policies": "nomad-server",
  "explicit_max_ttl": 0,
  "name": "nomad-cluster",
  "orphan": true,
  "period": 259200,
  "renewable": true
}

This allows a fairly long-lived token to be created, which can be renewed. It also limits what the tokens are allowed to do, which can be done as either a block list (disallowed_policies) or an allow list (allowed_policies). In this case, I am letting the Clients access any policies except the nomad-server policy.

We can install both of these into Vault:

vault policy write nomad-server /vagrant/vault/nomad-server-policy.hcl
vault write auth/token/roles/nomad-cluster @/vagrant/vault/nomad-cluster-role.json

Nomad

Now that Vault is up and running, we should configure Nomad to talk to it. This is done in two places - the Server configuration, and the Client configuration.

To configure the Nomad Server, we update its configuration file to include a vault block, which contains a role name it will use to generate tokens (for itself and for the Nomad Clients), and an initial token.

vault {
    enabled = true
    address = "http://localhost:8200"
    task_token_ttl = "1h"
    create_from_role = "nomad-cluster"
    token = "some_token_here"
}

The initial token is generated by the ./server.sh script - how you go about doing this in production will vary greatly depending on how you are managing your machines.
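If you are experimenting by hand, a token with the nomad-server policy can be created with something like this (a sketch only; the period and orphan settings here are just sensible-looking defaults, not necessarily what server.sh uses):

vault token create -policy="nomad-server" -period=72h -orphan -field=token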

The Nomad Clients also need the Vault integration enabled, but in their case only the address of Vault is required, as the Server node(s) will provide tokens for the clients to use.

vault {
    enabled = true
    address = "http://nomad1.mshome.net:8200"
}

Job Requirements

Before we go about changing the job itself, we need to write some data into Vault for the job to use:

  • Credentials: Username and password for the RabbitMQ Management UI, and the RABBITMQ_ERLANG_COOKIE
  • A policy for the job allowing Certificate Generation and Credentials access

Credentials

First off, we need to create a username and password to use with the Management UI. This can be done via the Vault CLI:

vault kv put secret/rabbit/admin \
    username=administrator \
    password=$(cat /proc/sys/kernel/random/uuid)

For the Erlang Cookie, we will also generate a Guid, but this time we will store it under a separate path in Vault so that it can be locked down separately from the admin username and password if needed:

vault kv put secret/rabbit/cookie \
    cookie=$(cat /proc/sys/kernel/random/uuid)
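If you want to double-check what was stored, or grab the generated values for manual testing, the Vault CLI can read them back:

vault kv get secret/rabbit/admin
vault kv get -field=cookie secret/rabbit/cookie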

Job Policy

Following the principle of Least Privilege, we will create a policy for our rabbit job which only allows certificates to be generated, and rabbit credentials to be read.

path "pki/issue/rabbit" {
  capabilities = [ "create", "read", "update", "delete", "list" ]
}

path "secret/data/rabbit/*" {
  capabilities = [ "read" ]
}

This is written into Vault in the same way as the other policies were:

vault policy write rabbit /vagrant/vault/rabbit-policy.hcl
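A quick way to sanity-check the policy is to create a short-lived token with only the rabbit policy attached and try issuing a certificate with it. This is just a sketch; the common name below is an example host in our mshome.net domain:

# assumes VAULT_ADDR is already set
RABBIT_TOKEN=$(vault token create -policy=rabbit -ttl=10m -field=token)
VAULT_TOKEN=$RABBIT_TOKEN vault write pki/issue/rabbit \
    common_name="nomad1.mshome.net" \
    ttl=1h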

Rabbit Job Configuration

The first thing we need to do to the job is specify what policies we want to use with Vault, and what to do when a token or credential expires:

task "rabbit" {
  driver = "docker"

  vault {
    policies = ["default", "rabbit"]
    change_mode = "restart"
  }
  #...
}

Certificates

To configure RabbitMQ to use SSL, we need to provide it with values for 3 environment variables:

  • RABBITMQ_SSL_CACERTFILE - The CA certificate
  • RABBITMQ_SSL_CERTFILE - The Certificate for RabbitMQ to use
  • RABBITMQ_SSL_KEYFILE - the PrivateKey for the RabbitMQ certificate

So let’s add a template block to the job to generate and write out a certificate. It’s worth noting that line endings matter. You either need your .nomad file to use LF line endings, or make the template a single line and use \n to add the correct line endings in. I prefer to have the file with LF line endings.

template {
  data = <<EOH
{{ $host := printf "common_name=%s.mshome.net" (env "attr.unique.hostname") }}
{{ with secret "pki/issue/rabbit" $host }}
{{ .Data.certificate }}
{{ .Data.issuing_ca }}
{{ .Data.private_key }}{{ end }}
EOH
  destination = "secrets/rabbit.pem"
  change_mode = "restart"
}

As we want to use the Nomad node’s hostname within the common_name parameter of the secret, we need to use a variable to fetch and format the value:

{{ $host := printf "common_name=%s.mshome.net" (env "attr.unique.hostname") }}
This can then be used by the with secret block to fetch a certificate for the current host:

{{ with secret "pki/issue/rabbit" $host }}
{{ .Data.certificate }}
{{ .Data.issuing_ca }}
{{ .Data.private_key }}{{ end }}
Now that we have a certificate in the ./secrets/ directory, we can add a couple of volume mounts to the container, and set the environment variables with the container paths to the certificates. Note how the root certificate is coming from the /vagrant directory, not from Vault itself. Depending on how you are provisioning your machines to trust your CA, you will have a different path here!

config {
  image = "pondidum/rabbitmq:consul"
  # ...
  volumes = [
    "/vagrant/vault/mshome.crt:/etc/ssl/certs/mshome.crt",
    "secrets/rabbit.pem:/etc/ssl/certs/rabbit.pem",
    "secrets/rabbit.pem:/tmp/rabbitmq-ssl/combined.pem"
  ]
}

env {
  RABBITMQ_SSL_CACERTFILE = "/etc/ssl/certs/mshome.crt"
  RABBITMQ_SSL_CERTFILE = "/etc/ssl/certs/rabbit.pem"
  RABBITMQ_SSL_KEYFILE = "/etc/ssl/certs/rabbit.pem"
  #...
}

You should also notice that we are writing the secrets/rabbit.pem file into the container twice: The second write is to a file in /tmp as a workaround for the docker-entrypoint.sh script. If we don’t create this file ourselves, the container script will create it by combining the RABBITMQ_SSL_CERTFILE file and RABBITMQ_SSL_KEYFILE file, which will result in an invalid certificate, and a nightmare to figure out…

If the Vault integration in Nomad could write a single generated secret to two separate files, we wouldn’t need this workaround. Alternatively, you could make a custom container with a customised startup script to deal with this for you.

You can see the version of this file with only these changes here

Credentials

Now that we have things running with a certificate, it would be a great idea to start using the Erlang Cookie value and Management UI credentials we stored in Vault earlier. This is a super easy change to support in the Nomad file - we need to add another template block, but this time setting env = true, which instructs Nomad that the key-values in the template should be loaded as environment variables:

template {
    data = <<EOH
    {{ with secret "secret/data/rabbit/cookie" }}
    RABBITMQ_ERLANG_COOKIE="{{ .Data.data.cookie }}"
    {{ end }}
    {{ with secret "secret/data/rabbit/admin" }}
    RABBITMQ_DEFAULT_USER={{ .Data.data.username }}
    RABBITMQ_DEFAULT_PASS={{ .Data.data.password }}
    {{ end }}
EOH
    destination = "secrets/rabbit.env"
    env = true
}

The complete nomad file with both certificates and credentials can be seen here.

Running!

Now, all we need to do is start our new secure cluster:

nomad job run rabbit/secure.nomad
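Once the allocations are up, we can check the job and poke the AMQPS listener to make sure it is presenting a certificate signed by our CA (5671 is the conventional AMQPS port; adjust it if your job maps the ports differently):

nomad job status rabbit

openssl s_client -connect nomad1.mshome.net:5671 \
    -CAfile /vagrant/vault/mshome.crt < /dev/null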

Client Libraries

Now that you have a secure version of RabbitMQ running, there are some interesting things which can be done with the client libraries. While you can just use the secure port, RabbitMQ also supports Peer Verification, which means that the client has to present a certificate for itself, and RabbitMQ will validate that both certificates are signed by a common CA.

This process can be controlled with two environment variables:

  • RABBITMQ_SSL_VERIFY set to either verify_peer or verify_none
  • RABBITMQ_SSL_FAIL_IF_NO_PEER_CERT set to true to require client certificates, false to make them optional
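For example, to require client certificates, something like the following could be added to the environment the container receives (for instance via the secrets/rabbit.env template above) - treat this as a sketch rather than a drop-in snippet:

RABBITMQ_SSL_VERIFY=verify_peer
RABBITMQ_SSL_FAIL_IF_NO_PEER_CERT=true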

In .net land, if you are using MassTransit, the configuration looks like this:

var bus = Bus.Factory.CreateUsingRabbitMq(c =>
{
    c.UseSerilog(logger);
    c.Host("rabbitmq://nomad1.mshome.net:5671", r =>
    {
        r.Username("some_application");
        r.Password("some_password");
        r.UseSsl(ssl =>
        {
            ssl.CertificatePath = @"secrets/app.crt";
        });
    });
});

There are also lots of other interesting things you can do with SSL and RabbitMQ, such as using the certificate as authentication rather than needing a username and password per app. But you should be generating your app credentials dynamically with Vault too…

Wrapping Up

Finding all the small parts to make this work was quite a challenge. The Nomad gitter was useful when trying to figure out the certificates issue, and being able to read the source code of the Docker image for RabbitMQ was invaluable in making the certificates work.

If anyone sees anything I’ve done wrong, or could be improved, I’m happy to hear it!

infrastructure, vagrant, nomad, consul, rabbitmq, vault

---

Hyper-V, Docker, and Networking Drama

22 Mar 2019

I had a major problem a few hours before giving my Nomad: Kubernetes Without the Complexity talk this morning: the demo stopped working.

Now, the first thing to note is the entire setup of the demo is scripted, and the scripts hadn’t changed. The only thing I had done was restart the machine, and now things were breaking.

The Symptoms

A Docker container started inside the guest VMs, with a port mapped to the machine’s public IP, wasn’t reachable from outside the guest (e.g. from the host machine).

For example, using a machine based off the bento/ubuntu-16.04 base box, provisioned with docker, running this from inside an SSH connection to the machine would work:

vagrant ssh

# launch a container which can respond to a http get
docker run -d --rm -p 172.127.48.105:5000:5000 registry:latest

# curl by public ip
curl http://172.127.48.105:5000 --silent -w "%{http_code}"   # 200

But running the same curl command on the host would fail:

# container is still running
curl http://172.127.48.105:5000 --silent -w "%{http_code}"   # timeout

Investigation

So it’s 5 hours before the demo (thankfully it’s not 10 minutes before), so let’s start digging into what could be causing this.

Docker Networking

I was also searching for Nomad and Docker networking issues, as I figured I could change the Nomad job to bind the container to all interfaces (e.g. -p 5000:5000) instead of just the one IP. This reply mentioned the docker0 network, and when I checked the guest machines, I saw that this network is also in the 172.* range.

So my guest machines had public addresses which happened to fall in the same range as a separate network adaptor on that machine.
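A quick way to see the clash is to compare the docker0 bridge with the machine’s external interface from inside the guest:

ip addr show docker0 | grep inet   # docker bridge range
ip addr show eth0 | grep inet      # the "public" address Hyper-V assigned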

Hyper-V IP Addresses

While I was checking the Windows Firewall to see if anything was weird in there, I stumbled across a rule I’d added to allow exposure of a NodeJS service from my host to Hyper-V guests (but not anywhere else). I noticed that the IP range it defined was 192.168.*, and I now had machines with 172.* addresses.

So the IP address range for guest machines had changed.

The Solution

Luckily, there is a straightforward solution to this:

Reboot until you get the range you want

Really.

The other solution is to use an External Switch in Hyper-V and bridge it with your host’s internet connection, which doesn’t really help me, as I am on a laptop, on different WiFi networks, and sometimes I use a thunderbolt based network adaptor too. And having to update/rebuild machines on every network change would be an absolute pain.

So I rebooted — a lot.

So if anyone from Microsoft is reading this: Please let us configure the Default Switch. Or have a way to recreate it without rebooting at least.

vagrant, docker, hyperv, networking

---

RabbitMQ clustering with Consul in Nomad

28 Jan 2019

Update If you want a secure version of this cluster, see Running a Secure RabbitMQ Cluster in Nomad.

RabbitMQ is the centre of a lot of microservice architectures, and while you can cluster it manually, it is a lot easier to use some of the auto clustering plugins, such as AWS (EC2), Consul, Etcd, or Kubernetes. As I like to use Nomad for container orchestration, I thought it would be a good idea to show how to cluster RabbitMQ when it is running in a Docker container, on an unknown host (i.e. one picked by Nomad.)

I ran into a few problems trying to get this working, but a lot of searching and some help from the RabbitMQ mailing list (thanks Luke!) got me through all the issues, so hopefully, this will be easier next time and for other people too.

It is also worth noting that this is only going to be covering how to make a cluster work, not how to make it secure (setting up TLS etc.) for production usage. There is a lot of documentation on the RabbitMQ website for further reading on this!

The full repository with all of the demo code is available on my Github.

Nomad Cluster

As this post is mostly about running RabbitMQ on Nomad, and not setting up Nomad, I’ll give the basics here - the full scripts are in the repository, and linked below too.

Vagrant is used to create a three-node cluster for us. As I use Hyper-V for VMs, I can’t set static IPs from the Vagrantfile, so I have used another Vagrant feature: triggers.

Triggers let us specify scripts to run after Vagrant actions, so in this case, we run a script after machine1 comes up which writes out its IP to the /vagrant share. The other machines can then read this same file to join the cluster:

Vagrant.configure(2) do |config|
  config.vm.box = "bento/ubuntu-16.04"

  config.vm.provision "shell", path: "./provision.sh", privileged: false

  config.vm.define "n1" do |n1|
    n1.vm.provision "shell", path: "./server.sh", privileged: false

    n1.trigger.after :up do |trigger|
      trigger.run_remote = { inline: "ip route get 1 | awk '{print $NF;exit}' > /vagrant/server_ip" }
    end
  end

  config.vm.define "n2" do |n2|
    n2.vm.hostname = "nomad2"
    n2.vm.provision "shell", path: "./client.sh", privileged: false
  end
end

The provision.sh script downloads and installs both Consul and Nomad, and then the respective server.sh and client.sh scripts set up both services in the right ways. The server machine also acts as a Nomad client, so that we don’t need 4 VMs running.

VagrantFile, provision.sh, server.sh, client.sh

All that remains to be done is vagrant up…and wait. But while we’re waiting for the machines to provision, let’s have a look at RabbitMQ clustering, and how we’ll define the job in Nomad.

RabbitMQ Cluster

A few things to note about clustering RabbitMQ:

  • All nodes must be listening on the same port for clustering (4369 by default)
  • The ERL_EPMD_PORT variable doesn’t work on rabbitmq < 3.7.9
  • The latest Docker image for rabbitmq is 3.7.8
  • The rabbit node names must be DNS resolvable
  • The RABBITMQ_ERLANG_COOKIE must have the same value

The rabbitmq_peer_discovery_consul plugin we will use is shipped with RabbitMQ by default but is disabled. The easiest way to get everything up and running is to create your own docker container, with the plugin enabled, and a small configuration file to set a few options:

FROM rabbitmq:management-alpine
COPY rabbitmq.conf /etc/rabbitmq
RUN rabbitmq-plugins enable --offline rabbitmq_peer_discovery_consul

The rabbitmq.conf only needs a few lines:

cluster_formation.peer_discovery_backend = rabbit_peer_discovery_consul
cluster_formation.consul.svc_addr_auto = true

The image is then built as usual:

docker build -t rabbitmq:consul .

An image built from this is also available on docker hub.

Once we have a custom container built, it’s a good idea to test that it actually works, before we start trying to get Nomad to run it. We’ll do this by creating a network in Docker so that all the containers can talk directly to each other on their pre-defined ports. Don’t forget to change CONSUL_HOST to your machine’s IP address!

docker network create rabbit
docker run -d --rm --name consul -p 8500:8500 consul

docker run -d --rm --name rabbit1 -h rabbit1 --network rabbit -p 30001:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e 'RABBITMQ_DEFAULT_USER=test' -e 'RABBITMQ_DEFAULT_PASS=test' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
docker run -d --rm --name rabbit2 -h rabbit2 --network rabbit -p 30002:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e 'RABBITMQ_DEFAULT_USER=test' -e 'RABBITMQ_DEFAULT_PASS=test' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
docker run -d --rm --name rabbit3 -h rabbit3 --network rabbit -p 30003:15672 -e RABBITMQ_ERLANG_COOKIE='rabbit' -e 'RABBITMQ_DEFAULT_USER=test' -e 'RABBITMQ_DEFAULT_PASS=test' -e CONSUL_HOST='10.0.75.1' rabbitmq:consul
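As well as checking the UI, you can ask any of the nodes for its view of the cluster - a quick sanity check, not required for the rest of the walkthrough:

docker exec rabbit1 rabbitmqctl cluster_status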

You can now visit http://localhost:30001 (or 30002 or 30003) and see that we have a successful cluster running. Once you’re happy with it, you can kill it all off (as we started the containers with the --rm flag, Docker will delete them for us when they stop):

docker stop rabbit1 rabbit2 rabbit3 consul
docker network rm rabbit

Nomad Rabbit Job

Now that we know our container clusters successfully, we can create a Job definition to do the same thing in Nomad. Nomad jobs are defined in HCL, a Json-like configuration language.

A job requires a name, which datacentres it should run in, and what type of job it is. In this case, our job is called rabbit (imaginative, I know), we’ll run it in dc1 (the default value Nomad starts with), and we’ll make this job a service, as opposed to a batch or system job:

job "rabbit" {

  datacenters = ["dc1"]
  type = "service"

  group "cluster" {
    # tasks ...
  }
}

The group is used to hold a collection of tasks, and when allocating a job, Nomad will make sure that all tasks belonging to an instance of a group are on the same node.

So for example, if you had a 2 node Nomad cluster, and 3 instances of a group which contained 3 tasks (e.g. API, varnish, and nginx containers), Nomad might distribute the groups like so:

(image: several nodes with groups of containers)

Within the group, we can specify the number of instances we want with the count property, and we also specify that for both updates and migrations, only one group can be changed at a time. This means that if you decide to upgrade the container used by the job, Nomad won’t stop all instances at once, destroying your service’s availability!

We also specify that we want to use the health checks (defined later on), rather than the state of the task itself, to determine what is healthy, how long a task must pass its checks before it is considered healthy, and the deadline by which it must become healthy.

group "cluster" {
  count = 3

  update {
    max_parallel = 1
  }

  migrate {
    max_parallel = 1
    health_check = "checks"
    min_healthy_time = "5s"
    healthy_deadline = "30s"
  }
}

The task is our unit of work in Nomad. In this case, we are using the docker driver, but it also supports many other drivers including exec, rkt and lxc. We configure which image to use, and importantly that the hostname is the name from Nomad!

The port_map tells nomad which ports of the container we want to expose, and labels them. We can then refer to the ports by their labels in other parts of the configuration.

task "rabbit" {
  driver = "docker"

  config {
    image = "pondidum/rabbitmq:consul"
    hostname = "${attr.unique.hostname}"
    port_map {
      amqp = 5672
      ui = 15672
      epmd = 4369
      clustering = 25672
    }
  }
}

The env section is pretty self-explanatory; they are environment variables to pass to the container. As Consul is running on the Nomad host, we use the Nomad interpolation attribute to specify the IP of the current host, and we also set the RABBITMQ_ERLANG_COOKIE to a specific value. In a production environment, you should be setting this value to something unguessable, possibly using the Vault integration in Nomad to fetch a token. We can also add other settings to pass to the container here, such as RABBITMQ_DEFAULT_USER and RABBITMQ_DEFAULT_PASS. As with the cookie generation, in a production-like environment, you’d probably want to use the Vault integration to pull the values for these variables.

env {
  RABBITMQ_ERLANG_COOKIE = "rabbitmq"
  CONSUL_HOST = "${attr.unique.network.ip-address}"
}

The resources section lets us set constraints on things like CPU, Memory, IOPs, and Network. In our case, we are only specifying a set of ports to expose on the network, and that we want them to be bound to specific ports on the host:

resources {
  network {
    port "amqp" { static = 5672 }
    port "ui" { static = 15672 }
    port "epmd" { static = 4369 }
    port "clustering" { static = 25672 }
  }
}

We could select different ports to bind the container ports to, or leave out the static pair entirely to have Nomad map the ports to random unused ports on the host.

Finally, the service block integrates with service discovery (so, Consul), and allows us to register ports and health checks for our service. In the case of our RabbitMQ cluster, we already have service discovery integration via the RabbitMQ Consul plugin, so this registration is only used for the check feature, which is what will also be used by the migrate block to see if a task is healthy:

service {
  check {
    name     = "alive"
    type     = "tcp"
    port     = "ui"
    interval = "10s"
    timeout  = "2s"
  }
}

The check is using the ui port defined earlier to check if the UI is alive. We could also change the health check to use the amqp port instead, as that might be a better indication that the actual service can do useful things. We can define multiple checks, and are not limited to TCP; grpc, http, and script are also supported.

The entire job definition is below, and is also available in the repository.

job "rabbit" {

  datacenters = ["dc1"]
  type = "service"

  group "cluster" {
    count = 3

    update {
      max_parallel = 1
    }

    migrate {
      max_parallel = 1
      health_check = "checks"
      min_healthy_time = "5s"
      healthy_deadline = "30s"
    }

    task "rabbit" {
      driver = "docker"

      config {
        image = "pondidum/rabbitmq:consul"
        hostname = "${attr.unique.hostname}"
        port_map {
          amqp = 5672
          ui = 15672
          epmd = 4369
          clustering = 25672
        }
      }

      env {
        RABBITMQ_ERLANG_COOKIE = "generate_a_guid_-_or_something_for_this"
        RABBITMQ_DEFAULT_USER = "test"
        RABBITMQ_DEFAULT_PASS = "test"
        CONSUL_HOST = "${attr.unique.network.ip-address}"
      }

      resources {
        network {
          port "amqp" { static = 5672 }
          port "ui" { static = 15672 }
          port "epmd" { static = 4369 }
          port "clustering" { static = 25672 }
        }
      }

      service {
        name = "rabbitmq"
        port = "ui"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }

    }
  }
}

Running The Job

First, make sure your console can talk to Nomad, which we can do by using the server_ip file again:

export NOMAD_ADDR="http://$(cat server_ip):4646"

Now it should be possible to run the job:

nomad job run rabbit/rabbit.nomad
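You can watch the allocations come up with the usual Nomad commands; the allocation ID placeholder below comes from the job status output:

nomad job status rabbit
nomad alloc status <ALLOCATION_ID>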

After a few moments, we can visit any of the Nomad hosts and log in to the RabbitMQ UI (http://SOME_SERVER_IP:15672), where we should be greeted with a running cluster of three nodes:

(image: rabbitmq cluster dashboard)

Homework

  • Kill a container on one of the nodes and see what happens (vagrant ssh n2 then docker stop <SOME_CONTAINER_ID>)
  • Create an Application which you deploy to Nomad which uses service discovery to talk to RabbitMQ
  • Create some more Nomad clients, and try making a bigger RabbitMQ cluster

If you have any questions, feel free to comment below or send me a tweet.

infrastructure, vagrant, nomad, consul, rabbitmq

---

Testing Immutable Infrastructure

01 Jan 2019

In my previous post, I glossed over one of the most important and useful parts of Immutable Infrastructure: Testability. There are many kinds of tests we can write for our infrastructure, but they should all be focused on the machine/service and maybe its nearest dependencies, not the entire system.

While this post focuses on testing a full machine (both locally in a VM, and remotely as an Amazon EC2 instance), it is also possible to do most of the same kind of tests against a Docker container. In fact, one of the tools used in this post supports building Docker containers as an output in parallel to the AMIs, so this can also assist in providing a migration path to/from Docker.

As an example, I will show how I built and tested a LogStash machine, including how to verify that the script to create the production machine is valid, that the machine itself has been provisioned correctly, and that the services inside work as expected.

I have published all the source code to GitHub. The examples in this post are all taken from the repository but might have a few bits removed just for readability. Check the full source out if you are interested!

Repository Structure and Tools

When it comes to building anything that you will have lots of, consistency is key to making it manageable. To that end, I have a small selection of tools that I use, and a repository structure I try and stick to. They are the following:

Vagrant - This is a tool for building and managing virtual machines. It can be backed by many different providers such as Docker, Hyper-V and VirtualBox. We’ll use this to build a local Linux machine to develop and test LogStash in. I use the Hyper-V provider, as that is what Docker for Windows also uses, and Hyper-V disables other virtualisation tools.

Packer - This tool provides a way to build machine images. Where Vagrant builds running machines, Packer builds the base images for you to boot, and can build multiple different ones (in parallel) from one configuration. We’ll use this to create our AMIs (Amazon Machine Images.)

Jest - This is a testing framework written in (and for) NodeJS applications. Whatever testing tool works best for your environment is what you should be using, but I use Jest as it introduces minimal dependencies, is cross-platform, and has some useful libraries for doing things like diffing json.

The repository structure is pretty simple:

  • scripts/
  • src/
  • test/
  • build.sh
  • logstash.json
  • package.json
  • vagrantfile

The src directory is where our application code will live. If the application is compiled, the output goes to the build directory (which is not tracked in source-control.) The test directory will contain all of our tests, and the scripts directory will contain everything needed for provisioning our machines.

We’ll describe what the use of each of these files is as we go through the next section.

Local Development

To create our virtual machine locally, we will use Vagrant. To tell Vagrant how to build our machine, we need to create a vagrantfile in our repository, which will contain the machine details and provisioning steps.

The machine itself has a name, CPU count, and memory specified. There is also a setting for Hyper-V which allows us to use a differencing disk, which reduces the startup time for the VM, and how much disk space it uses on the host machine.

For provisioning, we specify to run the relevant two files from the scripts directory.

Vagrant.configure("2") do |config|
    config.vm.box = "bento/ubuntu-16.04"

    config.vm.provider "hyperv" do |hv|
        hv.vmname = "LogStash"
        hv.cpus = 1
        hv.memory = 2048
        hv.linked_clone = true
    end

    config.vm.provision "shell", path: "./scripts/provision.sh"
    config.vm.provision "shell", path: "./scripts/vagrant.sh"
end

To keep things as similar as possible between our development machine and our output AMI, I keep as much of the setup as possible in one script: scripts/provision.sh. In the case of our LogStash setup, this means installing Java, LogStash, some LogStash plugins, and enabling the service on reboots:

#! /bin/bash

# add elastic's package repository
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list
sudo apt-get update

# install openjdk and set environment variable
sudo apt-get install openjdk-8-jre -y
JAVA=$(readlink -f $(which java) | sed "s:bin/java::")
echo "JAVA_HOME=$JAVA" | sudo tee --append /etc/environment

#install logstash and plugins
sudo apt-get install logstash -y
/usr/share/logstash/bin/logstash-plugin install logstash-filter-uuid
/usr/share/logstash/bin/logstash-plugin install logstash-filter-prune

sudo systemctl enable logstash.service

Vagrant will automatically mount its working directory into the VM under the path /vagrant. This means we can add a second provisioning script (scripts/vagrant.sh) to link the /vagrant/src directory to the LogStash configuration directory (/etc/logstash/conf.d), meaning we can edit the files on the host machine, and then restart LogStash to pick up the changes.

#! /bin/bash
sudo rm -rf /etc/logstash/conf.d
sudo ln -s /vagrant/src /etc/logstash/conf.d

sudo systemctl start logstash.service

Now that we have a vagrantfile, we can start the virtual machine with a single command. Note, Hyper-V requires administrator privileges, so you need to run this command in an admin terminal:

vagrant up

After a while, your new LogStash machine will be up and running. If you want to log into the machine and check files and processes etc., you can run the following command:

vagrant ssh

An argument can also be provided to the ssh command to be executed inside the VM, which is how I usually trigger LogStash restarts (as it doesn’t seem to detect when I save the config files in the src directory):

vagrant ssh -c 'sudo systemctl restart logstash'

Deployment

To create the deployable machine image, I use Packer. The process is very similar to how Vagrant is used: select a base AMI, create a new EC2 machine, provision it, and save the result as a new AMI.

Packer is configured with a single json file, in this case, named logstash.json. The file is split into four parts: variables, builders, provisioners, and outputs. I won’t include the outputs section as it’s not needed when building AMIs.

Variables

The variables property is for all configuration that you can pass to Packer. Their values can come from Environment Variables, CLI parameters, Consul, Vault, and others. In the LogStash example, there are three variables:

{
  "variables": {
    "aws_access_key": "",
    "aws_secret_key": "",
    "ami_users": "{{env `AMI_ACCOUNTS`}}"
  }
}

The aws_access_key and aws_secret_key are known names - unless we specify some value, they will automatically be read from your AWS config (in ~/.aws/), or if running on EC2, from the EC2 machine profile.

The ami_users is a custom variable which will read the AMI_ACCOUNTS environment variable by default. This particular one is used so that I can grant access to the resulting AMI to multiple AWS accounts, which is useful if you’re running in an Organisation with multiple Accounts. For example, if the AMI is built in a common account, and will be deployed into dev, qa and prod accounts, then you would populate the AMI_ACCOUNTS as a CSV of account IDs.
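For example, because the default reads from the environment, a build granting access to two (made-up) account IDs can be run like this instead of passing -var on the command line:

AMI_ACCOUNTS="111111111111,222222222222" packer build logstash.json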

Builders

Packer can build many different kinds of machine image, but for this, we only need one: amazon-ebs.

{
  "builders": [
    {
      "type": "amazon-ebs",
      "access_key": "{{user `aws_access_key`}}",
      "secret_key": "{{user `aws_secret_key`}}",
      "region": "eu-west-1",
      "instance_type": "t2.micro",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "ubuntu/images/*ubuntu-xenial-16.04-amd64-server-*",
          "root-device-type": "ebs"
        },
        "owners": ["099720109477"],
        "most_recent": true
      },
      "ssh_username": "ubuntu",
      "ami_name": "logstash {{timestamp}}",
      "ami_users": "{{user `ami_users`}}"
    },
  ]
}

The two most interesting properties of this are source_ami_filter and ami_users. The source_ami_filter works in a very similar manner to the AWS CLI’s describe-images --filters parameter, albeit in a more readable format. In this case, I am specifying that I want an ubuntu-xenial base, and I want it to be an official Canonical image, so specify their Account ID as the owner. I also specify the most_recent property, as this filter will return all versions of this AMI which Canonical publish.

The ami_users is what lets me grant access to the AMI from other accounts (rather than just making it public). The property’s value should be an array, but Packer is smart enough to expand the CSV in the user variable into an array for us.

Provisioners

The provisioners array items are executed in the order they are specified. To set up the machine, I use the shell provisioner to create a temporary directory, then the file provisioner to upload the files in the src directory to that temporary directory. Finally a second shell provisioner uploads and runs the scripts/provision.sh and scripts/aws.sh files.

{
  "provisioners": [
    {
      "type": "shell",
      "inline": "mkdir -p /tmp/src"
    },
    {
      "type": "file",
      "source": "./src/",
      "destination": "/tmp/src"
    },
    {
      "type": "shell",
      "scripts": ["./scripts/provision.sh", "./scripts/aws.sh"]
    }
  ]
}

The aws.sh file is very small and does roughly the same thing as the vagrant.sh script, but rather than symlinking the /vagrant directory, it moves the uploaded src directory into the right location for LogStash:

#! /bin/sh

sudo rm /etc/logstash/conf.d/*
sudo cp -r /tmp/src/* /etc/logstash/conf.d

Note that this doesn’t start the LogStash service - this gets done by the UserData when we launch a new instance, as often we need to pass in additional configuration parameters, and don’t want the service running until that has been done.
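As a rough sketch, the UserData might write the environment-specific settings and then start the service. The variable name and value below are purely illustrative, although /etc/default/logstash is the environment file the Debian package’s service normally reads:

#! /bin/bash
echo 'ELASTICSEARCH_HOST="elastic.example.internal"' | sudo tee -a /etc/default/logstash
sudo systemctl start logstash.service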

Running

To create the AMI, we need to invoke packer. If I am running packer on a remote machine via SSH, I run it inside tmux, so that disconnects don’t fail the process:

packer build -var "ami_users=111,222,333" logstash.json

After a while, Packer will finish, leaving you with an output which will include the new AMI ID:

==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:

eu-west-1: ami-123123123

We’ll get back to this output later when we create a build script that will also run our tests. Before we get to that, however, let’s look at how we can write tests which target both the local Vagrant machine and the AMI too.

Testing

To test the machines, I am using Jest. There isn’t anything particularly interesting going on in the package.json, other than a few babel packages being installed so that I can use ES6 syntax:

{
  "scripts": {
    "watch": "jest --watch",
    "test": "jest "
  },
  "devDependencies": {
    "babel-core": "^6.26.3",
    "babel-jest": "^23.6.0",
    "babel-preset-env": "^1.7.0",
    "jest": "^23.6.0",
    "regenerator-runtime": "^0.13.1"
  }
}

Packer Configuration Testing

There are a number of tests we can do to make sure our Packer configuration is valid before running it. This includes things like checking the base AMI is from a whitelisted source (such as our accounts, Amazon and Canonical). The test has to handle the possibility of multiple builders, and that some builders might not have a source_ami_filter. It also handles if no owner has been specified at all, which we also consider a “bad thing”:

const ourAccounts = [ "111111", "222222", "333333", "444444" ];
const otherOwners = [ "amazon", "099720109477" /*canonical*/ ];

describe("ami builder", () => {

  it("should be based on a whitelisted owner", () => {
    const allOwners = ourAccounts.concat(otherOwners);
    const invalidOwners = owners => owners.filter(owner => !allOwners.includes(owner));

    const amisWithInvalidOwners = packer.builders
      .filter(builder => builder.source_ami_filter)
      .map(builder => ({
        name: builderName(builder),
        invalidOwners: invalidOwners(builder.source_ami_filter.owners || [ "NO OWNER SPECIFIED" ])
      }))
      .filter(builders => builders.invalidOwners.length > 0);

    expect(amisWithInvalidOwners).toEqual([]);
  });

});

I also test that certain variables (ami_users) have been defined, and have been used in the right place:

describe("variables", () => {
  it("should have a variable for who can use the ami", () => {
    expect(packer.variables).toHaveProperty("ami_users");
  });

  it("should read ami_users from AMI_ACCOUNTS", () => {
    expect(packer.variables.ami_users).toMatch(
      /{{\s*env\s*`AMI_ACCOUNTS`\s*}}/
    );
  });
});

describe("ami builder", () => {
  it("should set the ami_user", () => {

    const invalidUsers = packer.builders
      .map(builder => ({
        name: builderName(builder),
        users: builder.ami_users || "NO USERS SPECIFIED"
      }))
      .filter(ami => !ami.users.match(/{{\s*user\s*`ami_users`\s*}}/));

    expect(invalidUsers).toEqual([]);
  });
})

Other tests you might want to add are that the base AMI is under a certain age, or that your AMI has certain tags included, or that it is named in a specific manner.

Machine Testing

Machine testing is for checking that our provisioning worked successfully. This is very useful, as subtle bugs can creep in when you don’t verify what happens.

For example, a machine I built copied a configuration directory to a target location, but the copy was missing the -r flag, so when I later added a subdirectory, the machine failed as the referenced files didn’t exist.

So that the tests work with both the Vagrant and Packer built versions, we take in their address and key paths from the environment:

import { spawnSync } from "child_process";
import { createConnection } from "net";

// connection details come from the environment, so the same tests work
// against both the local Vagrant VM and an EC2 instance
const host = process.env.LOGSTASH_ADDRESS; // e.g. "172.27.48.28";
const user = process.env.LOGSTASH_SSH; // e.g. "vagrant" locally, "ubuntu" on EC2
const keyPath = process.env.LOGSTASH_KEYPATH; // ".vagrant/machines/default/hyperv/private_key";

We also define two helper methods: one to check if a TCP port is open, and one which uses SSH to execute a command and read the response in the machine:

const execute = command => {
  const args = [`${user}@${host}`, `-i`, keyPath, command];
  const ssh = spawnSync("ssh", args, { encoding: "utf8" });
  const lines = ssh.stdout.split("\n");

  if (lines[lines.length - 1] === "") {
    return lines.slice(0, lines.length - 1);
  }
  return lines;
};

const testPort = port => new Promise((resolve, reject) => {
  const client = createConnection({ host: host, port: port });

  client.on("error", err => reject(err));
  client.on("connect", () => {
    client.end();
    resolve();
  });
});

We can then add some tests which check the files were written to the right place, that port 5044 is open, and port 9600 is closed:

describe("the machine", () => {

  it("should have the correct configuration", () => {
    const files = execute("find /etc/logstash/conf.d/* -type f");

    expect(files).toEqual([
      "/etc/logstash/conf.d/beats.conf",
      "/etc/logstash/conf.d/patterns/custom.txt"
    ]);
  });

  it("should be listening on 5044 for beats", () => testPort(5044));
  it("should not be listening on 9600", () => expect(testPort(9600)).rejects.toThrow("ECONNREFUSED"));
});

Of course, as we can execute any command inside the machine, we can check pretty much anything:

  • tail the LogStash log and see if it’s got the right contents
  • check if the service is started
  • check the service is enabled on boot
  • check the environment variables have been written to the right files
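For example, these are the kinds of commands the execute helper could run for those checks (the log file location is the package default):

systemctl is-active logstash                      # the service is started
systemctl is-enabled logstash                     # the service starts on boot
grep JAVA_HOME /etc/environment                   # the environment variable was written
tail -n 50 /var/log/logstash/logstash-plain.log   # recent log contents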

Application Testing

There are two styles of Application Testing: white-box and black-box. White-box will be tests run on the application inside the machine, using minimal external dependencies (preferably none at all), and Black-box will be run on the application from outside the machine, either using direct dependencies, or fakes.

It’s worth noting that both white-box and black-box tests are slow, mostly down to how slow LogStash is at starting up, although only giving it 1 CPU and 2Gb of RAM probably doesn’t help.

Whitebox Testing LogStash

To white-box test LogStash, I use a technique partially based on the Agolo LogStash Test Runner. The process for the tests is to run LogStash interactively (rather than as a service), send it a single event, record the output events, and compare them to an expected output.

The test cases are kept in separate folders, with two files. First is the input file, imaginatively called input.log, which will contain one json encoded event per line. The format needs to match what FileBeat would send to LogStash. In this case, it means a few extra fields, and a message property containing a string of json. Formatted for readability, the object looks like this:

{
  "@timestamp": "2018-12-27T14:08:24.753Z",
  "beat": { "hostname": "Spectre", "name": "Spectre", "version": "5.3.0" },
  "fields": { "environment": "local", "log_type": "application" },
  "input_type": "log",
  "message": "{\"Timestamp\": \"2018-12-18T17:06:27.7112297+02:00\",\"Level\": \"Information\",\"MessageTemplate\": \"This is the {count} message\",\"Properties\": {\"count\": 4,\"SourceContext\": \"LogLines.GetOpenPurchasesHandler\",\"ApplicationName\": \"FileBeatTest\",\"CorrelationId\": \"8f341e8e-6b9c-4ebf-816d-d89c014bad90\",\"TimedOperationElapsedInMs\": 1000}}",
  "offset": 318,
  "source": "D:\\tmp\\logs\\single.log",
  "type": "applicationlog"
}

I also define an output.log, which contains the expected result(s), again one json encoded event per line. The example pipeline in the repository will emit two events for a given input, so this file contains two lines of json (again, newlines added for readability here):

{
  "source": "D:\\tmp\\logs\\single.log",
  "@version": "1",
  "fields": { "log_type": "application", "environment": "local" },
  "@timestamp": "2018-12-18T15:06:27.711Z",
  "offset": 318,
  "ApplicationName": "FileBeatTest",
  "host": "ubuntu-16",
  "type": "applicationlog",
  "CorrelationId": "8f341e8e-6b9c-4ebf-816d-d89c014bad90",
  "MessageTemplate": "This is the {count} message",
  "Level": "Information",
  "Context": "LogLines.GetOpenPurchasesHandler",
  "TimeElapsed": 1000,
  "Properties": { "count": 4 }
}
{
  "duration": 1000000,
  "timestamp": 1545145586711000,
  "id": "<generated>",
  "traceid": "8f341e8e6b9c4ebf816dd89c014bad90",
  "name": "LogLines.GetOpenPurchasesHandler",
  "localEndpoint": { "serviceName": "FileBeatTest" }
}

To enable sending the lines directly to LogStash (rather than needing to use FileBeat), we define an input.conf file, which configures LogStash to read json from stdin:

input {
  stdin { codec => "json_lines" }
}

And an output.conf file which configures LogStash to write the output as json lines to a known file path:

output {
  file {
    path => "/tmp/test/output.log"
    codec => "json_lines"
  }
}

The tests need to be run inside the machine itself, so I created a script in the ./scripts directory which will do all the work, and can be run by the execute method in a Jest test. The script stops the LogStash service, copies the current configuration from the ./src directory and the replacement input.conf and output.conf files to a temporary location, and then runs LogStash once per test case, copying the result file to the test case’s directory.

#! /bin/bash

sudo systemctl stop logstash

temp_path="/tmp/test"
test_source="/vagrant/test/acceptance"

sudo rm -rf "$temp_path/*"
sudo mkdir -p $temp_path
sudo cp -r /vagrant/src/* $temp_path
sudo cp $test_source/*.conf $temp_path

find $test_source/* -type d | while read test_path; do
    echo "Running $(basename $test_path) tests..."

    sudo /usr/share/logstash/bin/logstash \
        "--path.settings" "/etc/logstash" \
        "--path.config" "$temp_path" \
        < "$test_path/input.log"

    sudo touch "$temp_path/output.log"   # create it if it doesn't exist (dropped logs etc.)
    sudo rm -f "$test_path/result.log"
    sudo mv "$temp_path/output.log" "$test_path/result.log"

    echo "$(basename $test_path) tests done"
done

sudo systemctl start logstash

To execute this, we use the beforeAll function to run it once - we also pass in Number.MAX_SAFE_INTEGER as by default beforeAll will time out after 5 seconds, and the test.sh is slow as hell (as LogStash takes ages to start up).

Once the test.sh script has finished running, we load each test’s output.log and result.log files, parse each line as json, compare the objects, and print out the delta if the objects are not considered equal:

const source = "./test/acceptance";
const isDirectory = p => fs.lstatSync(p).isDirectory();

const cases = fs
  .readdirSync(source)
  .map(name => path.join(source, name))
  .filter(isDirectory);

describe("logstash", () => {
  beforeAll(
    () => execute("/vagrant/scripts/test.sh"),
    Number.MAX_SAFE_INTEGER);

  test.each(cases)("%s", directoryPath => {
    const expected = readFile(path.join(directoryPath, "output.log"));
    const actual = readFile(path.join(directoryPath, "result.log"));

    const diffpatch = new DiffPatcher({
      propertyFilter: (name, context) => {
        if (name !== "id") {
          return true;
        }

        return context.left.id !== "<generated>";
      }
    });

    const delta = diffpatch.diff(expected, actual);
    const output = formatters.console.format(delta);

    if (output.length) {
      console.log(output);
    }

    expect(output.length).toBe(0);
  });
});

Blackbox Testing LogStash

As the machine has ports open for FileBeat and will send its output to ElasticSearch, we can set up a fake HTTP server, send some log events via FileBeat to the VM and check we receive the right HTTP calls to our fake server.

While looking on how to do this, I came across the lumberjack-protocol package on NPM, but unfortunately, it only supports lumberjack v1, and FileBeat and LogStash are now using v2, so you would have to use a local copy of filebeat to do the sending.

Due to the complexity of implementing this, and the diminished return on investment (the other tests should be sufficient), I have skipped creating the Blackbox tests for the time being.

AMI Testing

The final phase! Now that we are reasonably sure everything works locally, we need to build our AMI and test that everything works there too, as it would be a shame to update an Auto Scale Group with the new image which doesn’t work!

All that needs to happen to run the tests against an EC2 instance is to set the three environment variables we used with Vagrant, to values for communicating with the EC2 instance. To do this, we’ll need the EC2 IP Address, the username for SSH, and the private key for SSH authentication.

The first thing our build script needs to do is create the AMI. This is done in the same way as mentioned earlier, but with the slight difference of also piping the output to tee:

packer_log=$(packer build logstash.json | tee /dev/tty)
ami_id=$(echo "$packer_log" | tail -n 1 | sed 's/.*\(ami.*\)/\1/')

By using tee, we can pipe the build log from Packer to both the real terminal (/dev/tty), and to a variable called packer_log. The script then takes the last line and uses some regex to grab the AMI ID.

Next up, the script uses the AWS CLI to launch an EC2 instance based on the AMI, and store it’s IP Address and Instance ID:

json=$(aws ec2 run-instances \
  --image-id "$ami_id" \
  --instance-type t2.small \
  --key-name "$keypair_name" \
  --region eu-west-1 \
  --subnet-id "$subnet_id" \
  --security-group-ids "$security_group_id" \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=logstash-verification}]' \
  --user-data "$userdata")

instance_id=$(echo "$json" | jq -r .Instances[0].InstanceId)
private_ip=$(echo "$json" | jq -r .Instances[0].PrivateIpAddress)
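The instance needs a little time to boot before it will accept SSH connections; the AWS CLI can wait for it (you may still need a short extra delay for sshd and LogStash to come up):

aws ec2 wait instance-running \
  --region eu-west-1 \
  --instance-ids "$instance_id"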

The IP Address is then used to set up the environment variables which the node test scripts use to locate the machine:

LOGSTASH_ADDRESS="$private_ip" \
LOGSTASH_SSH="ubuntu" \
LOGSTASH_KEYPATH="~/.ssh/id_rsa" \
npm run test

Finally, the script uses the Instance ID to terminate the instance:

aws ec2 terminate-instances \
  --instance-ids "$instance_id"

Wrapping Up

Hopefully, this (rather long) post is a useful introduction (!) to how I tackle testing Immutable Infrastructure. All of these techniques for testing the machine and application can be used for testing things like Docker containers too (and handily, Packer can be used to create Docker containers also).

As mentioned earlier The Repository is available here.

logstash, microservices, infrastructure, vagrant, packer, aws, testing

---

Code-free tracing with LogStash and Jaeger

22 Dec 2018

I wanted to show request charts (similar to the network tab in firefox) for requests across our microservices but wanted to do so in the least invasive way possible.

(image: Firefox network timeline)

We already use LogStash to collect logs from multiple hosts (via FileBeat) and forward them on to ElasticSearch, so perhaps I can do something to also output from LogStash to a tracing service.

(image: app to FileBeat to LogStash to ElasticSearch and OpenTracing)

There are a number of tracing services available (AppDash, Jaeger, Zipkin), but unfortunately LogStash doesn’t have plugins for any of them or for OpenTracing. I picked Jaeger, as it seemed to be the easiest to get up and running, and also had a “Zipkin compatible” API too if you didn’t want to use the Jaeger one.

Setting up and running Jaeger in a production environment is way out of scope for this post, but for testing, I am just using the all-in-one Docker container:

docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one

The tracing APIs work in roughly the same way: all traces have the same traceid, and are sub-divided into spans. Spans have a unique id and can have a parent (span) id set. The trouble with sending data from LogStash is that I don’t have any parent IDs available. Well, at least without going and adding it to all our applications, and I’m not sure how I would want to propagate it correctly either.

So the first question is: can I push multiple spans without any parent IDs, and have it plot them as I would expect.

Is span.parentid required?

As the HTTP API documentation seems to be non-existent (other than “there is one!”), and most of the other docs mention the Thrift API, I decided to use that to test the theory. The Jaeger C# client library has a nice API to use, but I wanted to manually specify things (e.g. traceid, spanid, etc.) and the API has no way to do that, as pretty much everything is marked as internal.

Instead, I cloned the client repository, and added a test (which had access to all the internals I wanted):

public class Scratch
{
    private readonly HttpSender _sender;
    private readonly Tracer _tracer;
    private readonly TraceId _id;

    public Scratch(ITestOutputHelper output)
    {
        _sender = new HttpSender("http://localhost:14268/api/traces");
        _tracer = new Tracer.Builder("FirstService").Build();
        _id = TraceId.NewUniqueId(false);

        output.WriteLine(_id.ToString());
    }

    [Fact]
    public async Task Send()
    {
        var start = DateTime.UtcNow;

        await Create(start, start.AddSeconds(5));
        await Create(start, start.AddSeconds(3));

        await Create(start.AddSeconds(5), start.AddSeconds(10));
        await Create(start.AddSeconds(10), start.AddSeconds(20));
        await Create(start, start.AddSeconds(20));

        await _sender.FlushAsync(CancellationToken.None);
    }

    private async Task Create(DateTime start, DateTimeOffset finish)
    {
        var context = new SpanContext(_id, SpanId.NewUniqueId(), new SpanId(0), SpanContextFlags.None);
        var span = new Span(_tracer, "test", context, start, new Dictionary<string, object>(), null);
        span.Finish(finish);

        await _sender.AppendAsync(span, CancellationToken.None);
    }
}

The graph rendered how I hoped, proving that I don’t need to set the span’s parentid for everything with the same traceid to work correctly!

How to generate a consistent TraceID?

All our services pass along or generate a correlationID for each request, so my idea here was to do some conversion to put it in the traceid, which is an int64. Our correlationIDs are guids, which are internally 128-bit values, so I figured I would take the least significant 8 bytes and use that. This can be done in C# like so:

var correlationID = Guid.NewGuid();
var traceID = new TraceId(BitConverter.ToInt64(correlationID.ToByteArray(), 8));

I noticed, however, that the TraceID structure in the Jaeger client can take two values: int64 low and int64 high, so I can use the entire correlationid.

How to ship from LogStash?

After reading through the LogStash Output Plugins docs, it looked like there were three possible ways to ship the events to Jaeger: writing a command line app invoked through either the exec or pipe outputs; writing a LogStash plugin; or doing something with the http output.

CLI adaptor

The idea of this method would be to write a CLI app which takes an event from LogStash and then sends it to Jaeger using the Thrift API. The app itself could be invoked using the exec or pipe outputs. I didn’t want to use this method for a few reasons:

Development: The Jaeger client has things I need hidden from the public surface, such as being able to specify the traceid, rather than generating a new one each time, so I would either have to reimplement the classes I need, use reflection to instantiate things, or try opening Pull Requests to expose more internals of the library.

Maintenance: I’d rather not have yet another thing to maintain if I can avoid it, not to mention all the additional things like repository location, build agents, etc.

Deployment: Our LogStash box is a plain Ubuntu box with nothing other than Java and LogStash installed on it. As I would probably end up using C# to write the app, I’d need to add even more dependencies.

Throughput: Jaeger (and the others) support sending traces in batches for better efficiency. Writing an app which supported that would mean implementing the batching and caching myself, which I would likely get wrong, increasing the chances of lost messages.

LogStash plugin

This has basically the same problems as above, with the addition of needing to learn enough about Java to build something reliable. As this is a test and no one else in my team really knows Java either, I ruled this out.

Http?

The Jaeger collector process can also expose a Zipkin-compatible API, and while the documentation for that API is also terrible, I was able to find a Pull Request on Github with some structure defined. I figured I could do most of this directly with LogStash’s configuration DSL, assuming I could generate a uuid (with the logstash-filter-uuid plugin) and remove all the extra fields from the post body (with logstash-filter-prune).

Building the LogStash message

The first step of my configuration is based on what we need to do to Serilog messages to make them a bit nicer in Kibana:

Pre-processing

json {
    source => "message"
}

date {
    match => [ "Timestamp", "ISO8601", "yyyy-MM-dd'T'HH:mm:ss.SSSSSSSZZ"]
    target => "@timestamp"
}

mutate {
    rename => {
        "[Properties][ApplicationName]" => "ApplicationName"
        "[Properties][CorrelationId]" => "CorrelationId"
        "[Properties][SourceContext]" => "Context"
        "[Properties][TimedOperationElapsedInMs]" => "TimeElapsed"
    }
}

The json block tells LogStash that everything in the message property is a json object, rather than a string.

The date block extracts the time the message was recorded by the application and writes it to the special @timestamp field. This is a good idea because, due to queuing and processing times, the time the message was written and the time it was processed by LogStash will differ.

Lastly, the mutate block moves a bunch of fields out of the message properties and up to the top level of the object. We query by things like ApplicationName a lot, and it’s nice not to have to write Properties.ApplicationName:blahblah all the time.

Splitting

Our messages contain a lot of data which, for the purposes of tracing, we just don’t care about. Not only is it a waste of bandwidth to send all that data to Jaeger, but there could also be clashing fields, and if we ever decide to use a hosted version of Jaeger, I don’t want to be sending them all our data either.

So how do we strip everything from the copy we send to Jaeger, while keeping the full message for the other outputs?

LogStash has an interesting filter called clone. It copies the message as many times as you specify, and then runs the remaining part of the pipeline once per message, including the original. In the configuration, this looks like this:

clone {
    clones => [ "jaeger" ]
}

If we want to split the message more times, we can just add more types to the clones array; one extra message is produced for each entry in the array, and the entry’s value is added to the cloned message as the type property, so we can filter on it later.

We don’t want to send the type field itself to Jaeger (or ElasticSearch), so we move it from the message to the @metadata object, which doesn’t get passed to outputs.

mutate {
    rename => { "type" => "[@metadata][type]" }
}

To handle the messages differently, we can just use an if statement on the @metadata.type property:

if [@metadata][type] == "jaeger" {
    # do stuff to the jaeger message
} else {
    # process the original message
}

Building the Http Payload

The hardest part of getting this to work was handling the dates. Jaeger specifies that timestamps should come in microseconds since epoch, and duration should be in microseconds also. We need to use the ruby filter so we can run some code to do this, rather than using an inbuilt filter primitive.

The duration is easy enough, as we report TimeElapsed in milliseconds:

ruby {
    code => "event.set('duration', (event.get('TimeElapsed') * 1000).floor)"
}

To handle the timestamp we need to convert it to microseconds since the epoch. We can get the fractional seconds since the epoch by calling to_f on the @timestamp field, and then multiply by 1000 twice to express it as microseconds. We also floor it to remove the leftover fraction:

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor)"
}

This on its own will work, but it won’t be right. We report log entries when an operation completes, which means our timestamp is the finish of the event, not the start. To solve this, we subtract the duration from the timestamp:

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor - event.get('duration'))"
}

Next, we need to create our span id. This needs to be a random 16-character lowercase hex value, so the easiest way is to generate a uuid and then tweak it to fit the Jaeger/Zipkin format. To do that we need to install the logstash-filter-uuid plugin:

/usr/share/logstash/bin/logstash-plugin install logstash-filter-uuid

There are a few things we need to do to generate a valid span id (for Jaeger):

  1. generate a uuid
  2. remove the dashes
  3. make it lowercase
  4. trim it to 16 chars

We use a uuid filter and two mutate filters to achieve this:

uuid {
    target => "id"
}
mutate {
    gsub => [ "id", "-", "" ]
    lowercase => [ "id" ]
}
mutate {
    gsub => [ "id", "(^.{1,16}).*$", "\1"]
}

The reason we specify mutate twice is that the order of the same operation (gsub in this case) is not guaranteed within a single mutate block, and the trim to 16 characters must happen after the dashes have been removed.

Next up, we need to tweak the traceid from guid format into the same dashless, lowercase format:

mutate {
    gsub => [ "traceid", "-", "" ]
    lowercase => [ "traceid" ]
}

Next, we can populate some other context into the message by renaming the other fields we want. I discovered most of the fields and their formats from a combination of the Thrift DTOs and this pull request.

mutate {
    rename => {
        "CorrelationId" => "traceid"
        "Context" => "name"
        "ApplicationName" => "[localEndpoint][serviceName]"
    }
}

Finally, we want to trim all the other properties from our message, which we can do by using the logstash-filter-prune plugin:

prune {
    whitelist_names => [ "id", "traceid", "name", "timestamp", "duration", "localEndpoint" ]
}
mutate {
    remove_field => [ "@timestamp" ]
}

We have to explicitly remove the @timestamp property as the prune plugin seems to ignore it (it also ignores the @metadata property).

The complete filter code can be made a little more efficient by combining several of the mutate blocks, provided we keep things in the right order, as follows:

ruby {
    code => "event.set('duration', (event.get('TimeElapsed') * 1000).floor)"
}

ruby {
    code => "event.set('timestamp', (event.get('@timestamp').to_f * 1000 * 1000).floor - event.get('duration'))"
}

uuid {
    target => "id"
}

mutate {
    rename => {
        "CorrelationId" => "traceid"
        "Context" => "name"
        "ApplicationName" => "[localEndpoint][serviceName]"
    }
    remove_field => [ "@timestamp" ]
}

mutate {
    gsub => [ "id", "-", "" ]
    gsub => [ "traceid", "-", "" ]
    lowercase => [ "traceid", "id" ]
}

mutate {
    gsub => [ "id", "(^.{1,16}).*$", "\1"]
}

prune {
    whitelist_names => [ "id", "traceid", "name", "timestamp", "duration", "localEndpoint" ]
}

Publishing to Jaeger

When publishing to the Jaeger API, you have to send it an array of json objects. To achieve this with the http output plugin, we can set format => json_batch, which causes the HTTP client to buffer some events and then send them all at once, giving us better efficiency with no extra code on our part.

To help make testing easier, I publish to two different HTTP endpoints: the Jaeger docker container running on my host machine, and a Request Logger Bin. If Jaeger reports a failure, I can inspect the payload in the bin, replay it (via the Insomnia REST client), and check the error messages Jaeger returns.

output {
    if [@metadata][type] == "jaeger" {
        http {
            url => "http://192.168.0.115:9411/api/v2/spans"
            format => "json_batch"
            http_method => "post"
        }

        http {
            url => "http://requestloggerbin.herokuapp.com/bin/ff3da5b2-ec57-4611-ada8-b360ab38830c"
            format => "json_batch"
            http_method => "post"
        }
    } else {
        # normal message handling: elasticsearch etc.
        stdout { codec => rubydebug }
    }
}

I also kept an eye on LogStash’s own logs by tailing the log file:

tail -f /var/log/logstash/logstash-plain.log

Eventually, I had a working LogStash configuration and could view application traces in Jaeger.

Problems

One thing I cannot figure out is how to send the tags property to Jaeger. The Zipkin (and OpenTracing) APIs specify tags to be an object, e.g.

{
  "traceid": "1ee98d32185e41faba146c151ce8e27d",
  "id": "8377a6dae87947d7",
  "timestamp": 1545321238784000,
  "duration": 409000,
  "localEndpoint": { "serviceName": "PersonInformationService" },
  "name": "SomeOperation",
  "tags": {
    "CorrelationID": "1ee98d32-185e-41fa-ba14-6c151ce8e27d",
    "db.name": "test-blahblah"
  }
}

But LogStash wants the tags property to be an array of strings, and even using remove_field on tags and then manually repopulating it doesn’t work. Currently, I don’t need anything in the tags, but it would be nice to have it working; perhaps I should ask on the Elastic forums.

Wrapping Up

This post turned out to be a lot longer than I was expecting, but hopefully it was interesting. Next time I’ll try to go through how I test immutable infrastructure, as well as local development of the base images.

jaeger, logstash, microservices, infrastructure

---