Branching and Red Builds

10 Aug 2018

So this is a bit of a rant…but hopefully with some solutions and workarounds too. Let’s kick things off with a nice statement:

I hate broken builds.

Pretty much everyone agrees on this point, I think. The problem is that I mean all builds, including those on shared feature branches.

Currently, I work on a number of projects which use small(ish) feature branches. The way this works is that the team agrees on a new feature to work on, creates a branch, and then each developer works on tasks, committing to their own branches and Pull-Requesting into the feature branch. Once the feature branch is completed, it’s deployed and merged to master. We’ll ignore the fact that Trunk Based Development is just better for now.

[Diagram: branching - developers working on small tasks being merged into a feature branch]

The problem occurs when one of the first tasks to be completed is writing the behaviour (or acceptance) tests. These are written in something like SpecFlow, and call out to stubbed methods which throw NotImplementedException. When this gets merged, the feature branch build goes red and stays red until all the other tasks are done - and probably for a little while afterwards too. Nothing like “red-green-refactor” when your light can’t change away from red!

The Problems

  • Local tests fail, no matter how much you implement
  • Pull Requests to the feature branch don’t have passing build checks
  • The feature branch build is red, and you can’t easily tell whether that’s because:
    • Not everything is implemented yet
    • A developer has introduced an error and no one has noticed
    • The build machine is playing up

[Diagram: branching - developers working on small tasks being merged into a feature branch, with everything showing as failed builds]

Bad Solutions

The first thing we could do is not run the acceptance tests on a task branch’s build, and only run them when the feature branch build runs. This is a bad idea, as someone will forget to check whether their task’s acceptance tests pass, and it will take extra effort later to fix the broken acceptance tests.

We could also write the acceptance criteria without binding them to any stubbed methods, leaving the file as plain text rather than an executable specification. This is also a pretty bad idea - how much would you like to bet that it stays non-executable?

The Solution

Don’t have the acceptance tests as a separate task. Instead, split the criteria among the implementation tasks. This does mean that your other tasks should be Vertical Slices rather than Horizontal, which can be difficult to do depending on the application’s architecture.

An Example

So let’s dream up some super simple Acceptance Criteria:

  • When a user signs up with a valid email which has not been used, they receive a welcome email with an activation link.
  • When a user signs up with an invalid email, they get a validation error.
  • When a user signs up with an in-use email, they get an error.

Note how this is already pretty close to being the tasks for the feature? Our tasks are pretty much:

  • implement the happy path
  • implement other scenarios

Of course, this means that not everything can be done in parallel - I imagine you’d want the happy path task to be done first, and then the other scenarios are probably parallelisable.

So our trade-off here is that we lose some parallelisation, but gain feedback. While that may sound like a small win, it has a significant impact on the overall delivery rate - everyone knows whether their tasks are complete, and when the build goes red, you can be sure of what introduced the problem.

Not to mention that features are rarely this small - you probably have various separate acceptance criteria, such as being able to view an account page.

Oh, and once you can split your tasks like this, it’s only a small step to doing Trunk Based Development. Which would make me happy.

And developer happiness is important.

rant, git, ci, process, productivity, testing

---

Managing AppSettings in Consul

07 Aug 2018

Consul is a great utility to make running your microservice architecture very simple. Amongst other things, it provides Service Discovery, Health Checks, and Configuration. In this post, we are going to be looking at Configuration; not specifically how to read from Consul, but about how we put configuration data into Consul in the first place.

The usual flow for an application using Consul for configuration is as follows:

  1. App Starts
  2. Fetches configuration from Consul
  3. Configures itself
  4. Registers in Consul for Service Discovery
  5. Ready

Step 2 is very straightforward - you query the local Consul instance’s HTTP API, and read the response into your configuration object (if you’re using Microsoft’s Configuration libraries on dotnet core, you can use the Consul.Microsoft.Extensions.Configuration NuGet package).
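
If you’re not on dotnet, the raw HTTP call is easy enough to make yourself. As a rough sketch (SomeKey is just a made-up key name, and appsettings/testapp is the key prefix used later in this post):

# list everything stored under the application's prefix (values come back base64 encoded)
curl -s http://localhost:8500/v1/kv/appsettings/testapp?recurse

# fetch a single value as-is using ?raw
curl -s http://localhost:8500/v1/kv/appsettings/testapp/SomeKey?raw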

The question is, though: how does the configuration get into Consul in the first place? Obviously, we don’t want this to be a manual process, and as Consul’s HTTP API supports writing too, it doesn’t have to be! But where is the master copy of the configuration data stored? Where it should be! In the repository, alongside the application’s code.
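
Writing is just as simple - a single value can be put into the KV store with one HTTP call (again, the key name here is purely illustrative):

# write (or overwrite) one key under the application's prefix
curl -s -X PUT \
    --data '20' \
    http://localhost:8500/v1/kv/appsettings/testapp/ThreadCount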

[Diagram: repository structure, with config.json, config.test.json and config.prod.json in the root]

By default, all your configuration values should go into the base configuration (config.json); only use the environment-specific versions (e.g. config.test.json and config.prod.json) when a value needs to differ between environments.

Why store config in the repository?

There are many reasons for putting your configuration into a repository alongside the code it relates to, mostly around answering these questions:

  • When did this key’s value change?
  • Why did this key’s value change?
  • Who changed this (do they have more context for why)?
  • What values has this key been over time?
  • How often is this key changing?

If a value is changing often, with reasons (commit messages) such as “scale the thing due to increased traffic” and “scale the thing back down now it’s quiet”, that starts to tell you that you should be implementing some kind of autoscaling.

If you find out a key is set incorrectly, you can see how long it’s been wrong, and maybe discover that the value is not so much “wrong” as “not right anymore”.

The final piece of this is that you know the value in production will match the value specified - there are no operators accidentally adding a 0 to the end of the number of threads to run, etc.

Deployment

Now we just need to get the configuration out of the file and into Consul whenever it changes. As I use Terraform for deploying changes, I just need to update it to write to Consul as well.

[Diagram: deployment pipeline - git to AppVeyor to Terraform; Terraform writes to Consul and updates the ECS cluster]

Terraform supports writing to Consul out of the box; however, it can’t directly read and parse JSON files. We can use the external provider to get around that limitation:

data "external" "config_file" {
  program = ["cat", "config.json"]
}

resource "consul_key_prefix" "appsettings" {
  path_prefix = "appsettings/testapp/"
  subkeys = "${data.external.config_file.result}"
}

If we want to take things a step further and use our environment-specific override files, we just need the jq command line tool to merge the two JSON files, which can be done like so:

jq -s '.[0] * .[1]' config.json config.test.json
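
The * operator in jq performs a recursive (deep) merge, with the right-hand file winning any conflicts. A quick illustration, using a couple of throwaway files with made-up keys:

echo '{ "LogLevel": "Warning", "ThreadCount": 4 }' > base.json
echo '{ "LogLevel": "Debug" }' > overrides.json

jq -s '.[0] * .[1]' base.json overrides.json
# {
#   "LogLevel": "Debug",
#   "ThreadCount": 4
# }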

Unfortunately, the external provider has a very specific syntax for how it is called, and we can’t just specify the jq command directly, so it needs to go into another file:

#! /bin/bash
jq -s '.[0] * .[1]' "$@"

Finally, we can update the external block to use the new script. You could replace the second file with an interpolated string containing the current environment (e.g. "config.${var.environment}.json"):

data "external" "config_file" {
  program = ["bash", "mergeconfigs.sh", "config.json", "config.test.json"]
}

The complete version of this is here in my Terraform Demos repository on GitHub.

What next?

Have a go at managing your settings as part of your deployment pipeline! Depending on what tools you are using, you might need to implement your own HTTP posts to the Consul API, but the advantages of automating this task far outweigh the cost of writing some curl commands, in my opinion!

microservices, consul, terraform, 12factor

---

Locking Vault Down with Policies

23 Jun 2018

The final part of my Vault miniseries focuses on permissioning, which is provided by Vault’s Policies.

As everything in Vault is represented as a path, the policies DSL (Domain Specific Language) just needs to apply permissions to paths to lock things down. For example, to allow all operations on the cubbyhole secret engine, we would define this policy:

path "cubbyhole/*" {
    capabilities = ["create", "read", "update", "delete", "list"]
}

Vault comes with a default policy which allows token operations (such as a token looking up its own info, and renewing or revoking itself), and cubbyhole access.
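
If you’re curious exactly what the default policy grants, you can print it out (this assumes a reasonably recent Vault CLI):

vault policy read default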

Let’s combine the last two posts (Managing Postgres Connection Strings with Vault and Secure Communication with Vault) and create a Policy which will allow the use of generated database credentials. If you want more details on the how/why of the set up phase, see those two posts.

Setup

First, we’ll create two containers which will get removed on exit - a Postgres one and a Vault one. Vault is being started in dev mode, so we don’t need to worry about init and unsealing it.

docker run --rm -d -p 5432:5432 -e 'POSTGRES_PASSWORD=postgres' postgres:alpine
docker run --rm -d -p 8200:8200 --cap-add=IPC_LOCK -e VAULT_DEV_ROOT_TOKEN_ID=vault vault

Next, we’ll create our Postgres user account which Vault will use to create temporary credentials:

psql --username postgres --dbname postgres
psql> create role VaultAdmin with Login password 'vault' CreateRole;
psql> grant connect on database postgres to vaultadmin;

Let’s also configure the environment to talk to Vault as an administrator, and enable the two Vault plugins we’ll need:

export VAULT_ADDR="http://localhost:8200"
export VAULT_TOKEN="vault"

vault auth enable approle
vault secrets enable database

We’ll also set up our database secret engine, and configure database role creation:

vault write database/config/postgres_demo \
    plugin_name=postgresql-database-plugin \
    allowed_roles="reader" \
    connection_url="postgresql://{{username}}:{{password}}@10.0.75.1:5432/postgres?sslmode=disable" \
    username="VaultAdmin" \
    password="vault"

vault write database/roles/reader \
    db_name=postgres_demo \
    creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; \
        GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
    default_ttl="10m" \
    max_ttl="1h"

Creating a Policy

First, we need to create the policy. This can be supplied inline on the command line, but reading it from a file means it can be source-controlled, and you get something readable too!

While the filename doesn’t need to match the policy name, it helps make it a bit clearer if it does match, so we’ll call this file postgres-connector.hcl.

# vault read database/creds/reader
path "database/creds/reader" {
    capabilities = ["read"]
}

We can then register this policy into Vault. The write documentation indicates that you need to prefix the file path with @, but that doesn’t work for me:

vault policy write postgres-connector postgres-connector.hcl

Setup AppRoles

As before, we’ll create a demo_app role for our application to use to get a token. However, this time we’ll specify the policies field, passing in both default and our custom postgres-connector policy.

vault write auth/approle/role/demo_app \
    policies="postgres-connector,default"

When we generate our client token using the secret_id and role_id, we’ll get a token which can create database credentials, as well as access the cubbyhole.

The final part of the admin setup is to generate and save the secret_id and role_id:

vault write -f -field=secret_id auth/approle/role/demo_app/secret-id
vault read -field=role_id auth/approle/role/demo_app/role-id

Creating a Token and Accessing the Database

Opening a new command line window, we need to generate our client token. Take the two IDs output in the admin window, and use them in the following code block:

export VAULT_ADDR="http://localhost:8200"
SECRET_ID="" # from the 'admin' window!
ROLE_ID="" # from the 'admin' window!

export VAULT_TOKEN=$(curl -X POST --data "{ \"role_id\":\"$ROLE_ID\", \"secret_id\":\"$SECRET_ID\" }" $VAULT_ADDR/v1/auth/approle/login | jq  -r .auth.client_token)
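
Before generating any credentials, we can sanity-check that the token has picked up the expected policies - vault token lookup uses the VAULT_TOKEN we just exported (output trimmed to the interesting line):

vault token lookup
# ...
# policies    [default postgres-connector]
# ...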

Now that we have a client token, we can generate some database credentials:

vault read database/creds/reader
# Key                Value
# ---                -----
# lease_id           database/creds/reader/dc2ae2b6-c709-0e2f-49a6-36b45aa84490
# lease_duration     10m
# lease_renewable    true
# password           A1a-1kAiN0gqU07BE39N
# username           v-approle-reader-incldNFPhixc1Kj25Rar-1529764057

Which can also be renewed:

vault lease renew database/creds/reader/dc2ae2b6-c709-0e2f-49a6-36b45aa84490
# Key                Value
# ---                -----
# lease_id           database/creds/reader/dc2ae2b6-c709-0e2f-49a6-36b45aa84490
# lease_duration     10m
# lease_renewable    true

However, if we try to write to the database roles, we get an error:

vault write database/roles/what dbname=postgres_demo
# Error writing data to database/roles/what: Error making API request.
#
# URL: PUT http://localhost:8200/v1/database/roles/what
# Code: 403. Errors:
#
# * permission denied

Summary

It’s a good idea to have separate, fine-grained policies, which can then be grouped up against separate AppRoles, allowing each AppRole to have just the permissions it needs. For example, you could have the following Policies:

  • postgres-connection
  • postgres-admin
  • rabbitmq-connection
  • kafka-consumer

You would then have several AppRoles defined which could use different Policies:

  • App1: rabbitmq-connection, postgres-connection
  • App2: kafka-consumer, rabbitmq-connection
  • App3: postgres-admin

Which helps encourage you to have separate AppRoles for each of your applications!
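
Registering those AppRoles would look just like the demo_app role above - the application and policy names here are only the hypothetical ones from the list:

vault write auth/approle/role/app1 policies="rabbitmq-connection,postgres-connection"
vault write auth/approle/role/app2 policies="kafka-consumer,rabbitmq-connection"
vault write auth/approle/role/app3 policies="postgres-admin"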

Finally, the Vault website has a guide on how to do this too…which I only found after writing this! At least what I wrote seems to match up with their guide pretty well, other than that I also use AppRole authentication (and so should you!)

vault, security, microservices

---

Secure Communication with Vault

22 Jun 2018

I think Vault by Hashicorp is a great product - I particularly love how you can do dynamic secret generation (e.g. for database connections). But how do you validate that the application requesting the secret is allowed to perform that action? How do you know it’s not someone or something impersonating your application?

While I was musing on this at an airport the other day, my colleague Patrik sent me a link to a StackOverflow post about this very question.

The summary is this:

  1. Use an AppRole rather than a plain token
  2. Bake the RoleID into your application
  3. Provide a SecretID from the environment
  4. Combine both to get a token from Vault on startup
  5. Periodically renew said token.

Or, in picture form:

[Diagram: vault token flow]

So let’s see how we can go about doing this.

0. Setup Vault

This time we will use Vault in dev mode, which means that it starts unsealed, and we can specify the root token as something simple. On the downside, there is no persistence; restarting the container gives you a blank slate. If you would prefer to use Vault with persistent storage, see Section 2 of the previous post:

docker run \
    -d --rm \
    --name vault_demo \
    --cap-add=IPC_LOCK \
    -e VAULT_DEV_ROOT_TOKEN_ID=vault \
    -p 8200:8200 \
    vault

As in the previous article, we’ll export the VAULT_TOKEN and VAULT_ADDR variables so we can use the Vault CLI:

export VAULT_ADDR="http://localhost:8200"
export VAULT_TOKEN="vault"

For our last setup step, we need to enable the AppRole auth method:

vault auth enable approle

1. Create A Role

There are many parameters you can specify when creating a role, but for our demo_app role we are going to skip most of them, providing just token_ttl and token_max_ttl.

vault write auth/approle/role/demo_app \
    token_ttl=20m \
    token_max_ttl=1h

2. Request A Secret ID

Vault has two modes of working, called Push and Pull. Push mode is when you generate the secret_id yourself and store it against the role; Pull mode is when you ask Vault to generate the secret_id against the role and return it to you. I favour the Pull model, as it is one less thing to worry about (how to generate a secure secret_id).
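
For completeness, Push mode would look something like this - you generate a value yourself (uuidgen is just one way to do that) and write it to the role's custom-secret-id endpoint. The rest of this post sticks with Pull mode:

SECRET_ID=$(uuidgen)
vault write auth/approle/role/demo_app/custom-secret-id secret_id=$SECRET_ID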

For the Pull request itself, we have to specify -force (shorthand -f), as we are writing a secret which has no key-value pairs. As we are using the CLI, I have also specified -field=secret_id, which makes the command output only the secret_id’s value rather than the whole object.

export SECRET_ID=$(vault write -f -field=secret_id auth/approle/role/demo_app/secret-id)

echo $SECRET_ID
#> 119439b3-4eec-5e5b-ce85-c1d00f046234

3. Write Secret ID to Environment

This step would be done by another process, such as Terraform when provisioning your environment, or Spinnaker when deploying your containers.

As we are just using the CLI, we can pretend that $SECRET_ID represents the value stored in the environment.

4. Fetch Role ID

Next, assuming the role of the developer writing an app, we need to fetch the role_id for our demo_app role. As with fetching the secret_id, we specify -field=role_id so we only get that part of the response printed:

export ROLE_ID=$(vault read -field=role_id auth/approle/role/demo_app/role-id)

echo $ROLE_ID
#> 723d66af-3ddd-91c0-7b35-1ee51a30c5b8

5. Embed Role ID in Code

We’re on the CLI, and have saved the role_id into the $ROLE_ID variable, so nothing more to do here!

Let’s create a simple C# Console app to demo this with:

dotnet new console --name VaultDemo
dotnet new sln --name VaultDemo
dotnet sln add VaultDemo/VaultDemo.csproj
dotnet add VaultDemo/VaultDemo.csproj package VaultSharp

We also installed the VaultSharp NuGet package, which takes care of fetching the client token for you - but we will go through what this is doing internally later!

class Program
{
  private const string RoleID = "723d66af-3ddd-91c0-7b35-1ee51a30c5b8";

  static async Task Main(string[] args)
  {
    var auth = new AppRoleAuthenticationInfo(
      RoleID,
      Environment.GetEnvironmentVariable("SECRET_ID")
    );

    var client = VaultClientFactory.CreateVaultClient(
      new Uri("http://localhost:8200"),
      auth
    );

    await client.CubbyholeWriteSecretAsync("test/path", new Dictionary<string, object>
    {
      { "Name", "I'm a secret Name!" }
    });

    var secrets = await client.CubbyholeReadSecretAsync("test/path");
    Console.WriteLine(secrets.Data["Name"]);
  }
}

6. Deploy!

As we’re running locally, nothing to do here, but if you want, imagine that you created a docker container or baked an AMI and deployed it to the cloud or something!

7. Run / On Start

As we’ve already saved the SECRET_ID into an environment variable, we can just run the application:

dotnet run --project VaultDemo/VaultDemo.csproj
#> I'm a secret Name!

So what did the application do?

When run, the application used both the role_id from the constant and the secret_id environment variable to call Vault’s Login method. An equivalent curl command would be this:

curl -X POST \
    --data '{ "role_id":"723d66af-3ddd-91c0-7b35-1ee51a30c5b8", "secret_id":"119439b3-4eec-5e5b-ce85-c1d00f046234" }' \
    http://localhost:8200/v1/auth/approle/login

This will spit out a single line of json, but if you have jq in your path, you can prettify the output by appending | jq .:

{
  "request_id": "37c0e057-6fab-1873-3ec0-affaace26e76",
  "lease_id": "",
  "renewable": false,
  "lease_duration": 0,
  "data": null,
  "wrap_info": null,
  "warnings": null,
  "auth": {
    "client_token": "c14f5806-aff2-61b6-42c2-8920c8049b6c",
    "accessor": "aef3d4f4-d279-bcda-8d9c-2a3de6344975",
    "policies": [
      "default"
    ],
    "metadata": {
      "role_name": "demo_app"
    },
    "lease_duration": 1200,
    "renewable": true,
    "entity_id": "34b1094b-28d4-1fb0-b8f6-73ad28d80332"
  }
}

The line we care about is client_token in the auth section. The value is used to authenticate subsequent requests to Vault.

For instance, in the C# app we used the CubbyHole backend to store a Name. The equivalent curl commands would be:

export VAULT_TOKEN="c14f5806-aff2-61b6-42c2-8920c8049b6c"

# vault write cubbyhole/test/path name="Another manual secret"
curl -X POST \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    --data '{ "Name": "Another manual secret" }' \
    http://localhost:8200/v1/cubbyhole/test/path

# vault read cubbyhole/test/path
curl -X GET \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    http://localhost:8200/v1/cubbyhole/test/path

So why use the client library if it’s just HTTP calls? Simple - by using VaultSharp (or equivalent) we get token auto renewal handled for us, along with working APIs; no more guessing and head-scratching while trying to work out the proper HTTP call to make!
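
For reference, the periodic renewal that the library performs is (roughly) just a POST to the token’s renew-self endpoint:

curl -X POST \
    --header "X-Vault-Token: $VAULT_TOKEN" \
    http://localhost:8200/v1/auth/token/renew-self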

What Next?

Read up on what you can do with Roles - such as limiting token and secret lifetimes, usage counts, etc.
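
As a taste of what is available, the demo_app role could be tightened up with something like the following - all standard AppRole parameters, with the values picked arbitrarily:

vault write auth/approle/role/demo_app \
    token_ttl=20m \
    token_max_ttl=1h \
    token_num_uses=10 \
    secret_id_ttl=30m \
    secret_id_num_uses=5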

Next article will probably cover Vault’s Policies.

vault, security, microservices

---

Fixing Docker volume paths on Git Bash on Windows

18 Jun 2018

My normal development laptop runs Windows, but like a lot of developers, I make huge use of Docker, which I run under Hyper-V. I also do most of my work in the Git Bash terminal on Windows.

Usually, everything works as expected, but I was recently trying to run an ELK (Elasticsearch, Logstash, Kibana) container, and needed to pass in an extra configuration file for Logstash. This caused me a lot of trouble, as nothing was working as expected.

The command I was running is as follows:

docker run \
    -d --rm \
    --name elk_temp \
    -p 5044:5044 \
    -p 5601:5601 \
    -p 9200:9200 \
    -v logstash/app.conf:/etc/logstash/conf.d/app.conf \
    sebp/elk

But this had the interesting effect of mounting app.conf in the container as an (empty) directory, rather than doing the useful thing of mounting it as a file. Hmm. I realised the issue was Git Bash transforming the paths to Windows style, but all the workarounds I tried failed:

# single quotes
docker run ... -v 'logstash/app.conf:/etc/logstash/conf.d/app.conf'
# absolute path
docker run ... -v /d/dev/temp/logstash/app.conf:/etc/logstash/conf.d/app.conf
# absolute path with // prefix
docker run ... -v //d/dev/temp/logstash/app.conf:/etc/logstash/conf.d/app.conf

In the end, I found a way to switch off path conversion in MSYS (which Git Bash is based on):

MSYS_NO_PATHCONV=1 docker run \
    -d --rm \
    --name elk_temp \
    -p 5044:5044 \
    -p 5601:5601 \
    -p 9200:9200 \
    -v logstash/app.conf:/etc/logstash/conf.d/app.conf \
    sebp/elk

And Voila, the paths get passed through correctly, and I can go back to hacking away at Logstash!
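
If you hit this often, you could make the setting stick by exporting it from your ~/.bashrc - bearing in mind that it disables path conversion for everything, not just Docker - or scope it to Docker with an alias:

# in ~/.bashrc
export MSYS_NO_PATHCONV=1

# or, only for docker commands
alias docker='MSYS_NO_PATHCONV=1 docker'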

git, docker, bash, windows

---