Against SemVer

16 Dec 2018

Well, for Applications & Services at least. For libraries, SemVer is the way to go, assuming you can agree on what counts as a breaking change.

But when it comes to Applications (or SaaS products, websites, etc.), SemVer starts to break down. The problem begins with the most obvious question: what is a breaking change? How about a minor change?

What’s in a change?

For example, if we were to change the UI of a web application without touching the backend, it is probably a breaking change from the user’s perspective, but not from the developer’s perspective. What about changing a backend process, for example, the way the service is billed? How about adding a new step in the middle of an existing process?

These are all hard questions to answer, and I imagine there are many edge cases where it is simply not clear what level of change they represent.

Clash of the Versions

The next problem stems from the fact that we don’t (often) do Trunk Based Development for applications. We have a long-lived (1-2 weeks) feature branch, which might get pushed to a test environment multiple times as tasks are completed. If we SemVer these deployments, then when a bug fix happens on the master branch, we can end up with a version clash - both branches claiming, say, 1.2.0 - and can’t deploy.

branching, showing a clash of SemVer by having concurrent branches

While this is a problem, we can solve it easily: use a -pre or -beta suffix for the feature branch, then remove the suffix and increment the version number when the feature is fully deployed. This, however, adds a little more complexity to the process - mostly on the human side of things this time.

I would rather avoid the complexity (machine and human) entirely, so instead I opt for a different solution: Dates.

How about a Date?

Date stamps to the rescue! Our builds now use the following format:

<year>.<month>.<day>.<build_number>

We did consider using the time (represented as .hhmm) instead of the build_number, but two builds triggered in the same minute could have clashed, whereas the build_number is guaranteed to be unique. By using an automatic format, we gain a few advantages (a sketch of generating this format follows the list):

  • No confusion on whether a change is a major, minor, or patch
  • No clashes on multiple branches
  • No human input required (marking things as pre or beta)
  • Answers the “when was this built” question
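
As an illustration, here is a minimal sketch of generating such a version string in C#; the BUILD_NUMBER environment variable is an assumption, standing in for whatever unique counter your CI system provides:

// Minimal sketch: a date-based version in the
// <year>.<month>.<day>.<build_number> format.
using System;

public static class BuildVersion
{
    public static string Create(DateTime buildDate, int buildNumber) =>
        $"{buildDate:yyyy.MM.dd}.{buildNumber}";
}

// e.g. in a build script, assuming the CI system provides BUILD_NUMBER:
// var version = BuildVersion.Create(
//     DateTime.UtcNow,
//     int.Parse(Environment.GetEnvironmentVariable("BUILD_NUMBER")));
// => "2018.12.16.42"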

What about the libraries?

The libraries have a very different set of requirements, and one versioning scheme doesn’t seem to fit both very well, so there is no point trying to force the matter.

code, ci, cd

---

Stopping Caring...

08 Dec 2018

…about GitHub open source commit streak.

This is, I think, partially triggered by Marc Gravell’s post. I currently have a GitHub commit streak which has been going for 1878 days. The other night I realised that I don’t care about it any more, and what’s more, I’m not sure why I did to start with.

I didn’t even mean to start doing it. I just noticed one day that I had done something every day for a couple of weeks, and vaguely wondered how long I could keep that up. It turns out the answer is “quite a while”. For the most part, it was easy to do - just a small commit a day; it only takes a few minutes, right? But more and more often, I noticed that it was having a negative effect on what I was writing.

For instance, I wanted to learn about Nomad, but had been putting it off, as it wouldn’t involve making any public commits. So each time I came to start, I couldn’t until I had done a public commit, and once that commit was done, so was my motivation. Which meant learning Nomad didn’t happen for a long time.

It also affected what I did with old repositories on GitHub - I wanted to remove some which serve no useful purpose any more, but I couldn’t do that either, as that would remove their commits from the streak. Likewise, I have a new version of my TwelveFactor repository, but I really should remove the old version. But doing so would remove the commits from the streak too. So, as you might have guessed, I haven’t done that either. Maybe soon. I have a lot of writing to go with it too.

On the subject of writing, that also suffered. It takes a lot of effort to sit down and start writing a post, but usually, the words flow pretty smoothly once I am going. But knowing I need to spare some energy for the commit of the day makes it even harder to start writing, so I just haven’t been. There are a lot of drafts with bullet-point lists waiting though, so maybe they will begin to see the light of day soon.

The irony is not lost on me that making this post will increase the commit count (GitHub pages based blog), so I am not committing (or finishing) this post until at least a day has passed.

I decided on the 7th December 2018 not to make any commits. After that had happened - the act of deciding, mind you, not the day itself - I felt a lot better. I do not need to do anything. I can go and finish the book a friend recommended to me. I can spend my evenings learning about things I want to. Or just do nothing some evenings.

No more worrying about public commits. It’s just not worth it.

code, books, health

---

Microservices or Components

28 Oct 2018

One of the reasons people list for using MicroServices is that it helps enforce separation of concerns. This is usually achieved by adding a network boundary between the services. While this is useful, it’s not without costs, namely that you’ve added a whole set of new failure modes: the network. We can achieve the same separation of concerns within the same codebase if we put our minds to it. In fact, this is what Simon Brown calls a Modular Monolith, and DHH calls the Majestic Monolith.

We recently needed to expand an existing service to have some new functionality. The current process looks something like this, where the user has done something which will eventually return them a URL which can be clicked to get to a web page to see the results.

api call does some work, returns a result_url which points to a web interface

The new process is an additional authentication challenge which the user will need to complete before they can get to the final results page. The new process looks like this:

api call does work, makes a request to the challenge API, passing the result_url as an argument. The challenge API returns a challenge_url, which is returned to the user instead of the result_url

Design Decisions

Currently, the challenge functionality will only be used by this one service, but there is a high probability that we will need it for other services in the future too. At this point we have a decision to make: do we keep this functionality in-process, or make a separate microservice for it?

Time To Live

The first trade-off is time: it is slightly quicker to make it in-process, but if we do want to use this from somewhere else later, we’ll need to extract it, which is more work. The key here is “if” - we don’t know for sure that other services will need this exact functionality.

If we keep the new API and UI within the existing API and UI projects, we also get some code reuse: the data store, data access tooling, permissions, and styles can all be reused. Also, all of our infrastructure, such as logging and monitoring, is already in place, which will save us some time too.

API Risk

We want to avoid deploying a service which then needs to undergo a lot of rework in the future if the second and third users of it have slightly different requirements. If we build it as a separate service now, will we be sure we are making something which is generic and reusable by other services? Typically you only get the answer to this question after the second or third usage, so it seems unlikely that we would get our API design perfect on the first attempt.

Technical Risks

If we are to go the separate service route, we are introducing new failure modes to the existing API. What if the challenge API is down? What if the request times out? Are we using HTTP or a Message Broker to communicate with it?

If we keep the service in-process to start with, we can eliminate all of these concerns. Luckily, we tend to have very thin controllers and make use of MediatR, so the details of whether the call is in-process or remote can be hidden in the message handler to a certain extent.
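
To illustrate, here is a minimal sketch of that pattern; the CreateChallenge types are hypothetical, invented for this example rather than taken from our codebase (requires MediatR and ASP.NET Core MVC):

// The controller knows nothing about how the challenge is created:
public class ChallengeController : Controller
{
    private readonly IMediator _mediator;

    public ChallengeController(IMediator mediator) => _mediator = mediator;

    [HttpPost]
    public async Task<IActionResult> Create(CreateChallenge request) =>
        Ok(await _mediator.Send(request));
}

public class CreateChallenge : IRequest<ChallengeResponse>
{
    public string ResultUrl { get; set; }
}

public class ChallengeResponse
{
    public string ChallengeUrl { get; set; }
}

// The handler is the only place which knows whether the implementation
// is in-process or a remote call, so swapping it later is contained here:
public class CreateChallengeHandler : IRequestHandler<CreateChallenge, ChallengeResponse>
{
    public Task<ChallengeResponse> Handle(CreateChallenge request, CancellationToken cancellationToken)
    {
        // in-process implementation for now; the url construction is
        // invented for illustration
        var url = "https://example.com/challenge?redirect=" + request.ResultUrl;
        return Task.FromResult(new ChallengeResponse { ChallengeUrl = url });
    }
}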

Technical Decisions

As alluded to in the Time To Live point, we can reuse the existing data store and data access code, but this is a tradeoff in itself: what if the current storage tech is not quite ideal for the new requirements?

If the current service makes use of a complex Entity Framework model, but the new service is so simple that Dapper makes more sense, do we introduce the new dependency or not? What if we wanted to migrate away from one datastore to another (e.g. removing all MongoDB usage in favour of Postgres), but this is already using Mongo? We’d be increasing our dependency on a datastore we are explicitly trying to migrate away from.

All this assumes we want to write the service in the same programming language as the existing service! In our case we do, but it’s worth considering if you have multiple languages in use already.

Finally, on the data store front: if we decide to extract this as a separate service later, we will have to take into account data migrations, and how we can handle them with little, if any, downtime.

The Decision

After weighing up all these points (and a few others), we decided to keep the new functionality inside the existing services. The Challenge API will live in its own area in the current API, and likewise, the Challenge UI will live in its own area in the existing UI.

How do we go about keeping it all separated though?

  • Communication: we discuss all changes we want to make anyway, so these discussions are the first line of defence against the code becoming tightly coupled.
  • Pull Requests: someone will notice you are doing something which reduces the separation, and a discussion about how to avoid this will happen.
  • Naming Conventions: the Challenge API shares no naming of properties with the existing API. For example, the current API passes in a results_url and results_id, but the Challenge API stores and refers to these as the redirect_url and external_id.
  • Readme: it’ll go into the repository’s readme file, along with any other notes which developers will find useful. The sequence diagrams we drew (with much more detail) will also go in here.

Technical Debt?

The final question on this decision is “Isn’t this technical debt we are introducing?”. The answer, I feel, is “no”; it feels much closer to applying the YAGNI Principle (You Ain’t Gonna Need It). While there is work in the backlog which could use a Challenge API at the moment, that doesn’t necessarily mean it will still be there next week, or that it won’t be pushed further back or changed later.

In the end, the meeting where we came up with this and drew things on the whiteboard together was productive, and likely took much less time than writing all this down did. We were able to resist the “cool hip microservice” trend and come up with a design which is pretty contained and composable with other systems in the future.

If after all this discussion we decided to go the MicroService route, I would still be happy with the decision, as we would have all this material to look back on and justify our choice, rather than waving our hands about and shouting “but microservices” loudly.

How do you go about designing systems? Microservice all the things? Monolith all the things? Or something in between which makes the most sense for the situation at hand?

architecture, microservices, design

---

SketchNotes: Finding Your Service Boundaries

10 Sep 2018

At NDC Oslo this year, I attended Adam Ralph’s talk on Finding Your Service Boundaries. I enjoyed it a lot, and once the video came out, I rewatched it and decided to have a go at doing a “sketchnotes” version, which I shared on Twitter - people liked it!

I’ve never done one before, but it was pretty fun. I made it in OneNote, zoomed out a lot, and took a screenshot.

Click for huuuuge!

sketchnotes - finding your service boundaries

sketchnotes, conference

---

Semantic Configuration Validation: Earlier

08 Sep 2018

After my previous post on Validating Your Configuration, one of my colleagues made an interesting point, paraphrasing:

I want to know if the configuration is valid earlier than that. At build time preferably. I don’t want my service to not start if part of it is invalid.

There are two points here, namely when to validate, and what to do with the results of validation.

Handling Validation Results

If your configuration is invalid, you’d think the service should fail to start, as it might be configured in a dangerous manner. While this makes sense for some services, others might need to work differently.

Say you have an API which supports both writing and reading of a certain type of resource. The read will return you a resource of some form, and the write side will trigger processing of a resource (and return you a 202 Accepted, obviously).

What happens if your configuration just affects the write side of the API? Should you prevent people from reading too? Probably not, but again it depends on your domain as to what makes sense.

Validating at Build Time

This is the far more interesting point (to me). How can we modify our build to validate that the environment’s configuration is valid? We have the code to do the validation: we have automated tests, and we have a configuration validator class (in this example, implemented using FluentValidation).
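
For reference, a minimal sketch of what that validator class could look like; the exact rules here are assumptions for this example:

using System;
using FluentValidation;

public class AppConfigurationValidator : AbstractValidator<AppConfiguration>
{
    public AppConfigurationValidator()
    {
        // rules invented for illustration - match them to your real requirements
        RuleFor(config => config.Callback).NotEmpty();
        RuleFor(config => config.Timeout).GreaterThan(TimeSpan.Zero);
        RuleFor(config => config.MaxRetries).GreaterThan(0);
    }
}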

Depending on where your master configuration is stored, the next step can get much harder.

Local Configuration

If your configuration is in the current repository (as it should be), then it will be no problem to read:

public class ConfigurationTests
{
    public static IEnumerable<object[]> AvailableEnvironments => Enum
        .GetValues(typeof(Environments))
        .Cast<Environments>()
        .Select(e => new object[] { e });

    [Theory]
    [MemberData(nameof(AvailableEnvironments))]
    public void Environment_specific_configuration_is_valid(Environments environment)
    {
        var config = new ConfigurationBuilder()
            .AddJsonFile("config.json")
            .AddJsonFile($"config.{environment}.json", optional: true)
            .Build()
            .Get<AppConfiguration>();

        var validator = new AppConfigurationValidator();
        validator.ValidateAndThrow(config);
    }
}

Given the following two configuration files, we can make it pass and fail:

config.json:

{
  "Callback": "https://localhost",
  "Timeout": "00:00:30",
  "MaxRetries": 100
}

config.local.json:

{
  "MaxRetries": 0
}

Remote Configuration

But what if your configuration is not in the local repository, or at least, not completely there? For example, we have a lot of configuration in Octopus Deploy, and would like to validate that at build time too.

Luckily, Octopus has a REST API (and accompanying client) which you can use to query the values. All we need to do is replace the AddJsonFile calls with AddInMemoryCollection() and populate the dictionary from somewhere:

[Theory]
[MemberData(nameof(AvailableEnvironments))]
public async Task Octopus_environment_configuration_is_valid(Environments environment)
{
    var variables = await FetchVariablesFromOctopus(
        "MyDeploymentProjectName",
        environment);

    var config = new ConfigurationBuilder()
        .AddInMemoryCollection(variables)
        .Build()
        .Get<AppConfiguration>();

    var validator = new AppConfigurationValidator();
    validator.ValidateAndThrow(config);
}

Reading the variables from Octopus’ API requires a bit of work, as you don’t appear to be able to ask for all the variables which would apply if you deployed to a specific environment; you have to build that logic yourself. However, if you are just using Environment scoping, it shouldn’t be too hard.
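
Here is a rough sketch of what FetchVariablesFromOctopus could look like using the Octopus.Client package, assuming variables are only scoped by Environment; the server URL and API key are placeholders, and the exact client calls should be treated as an approximation rather than a definitive reference:

// requires the Octopus.Client package
// (using Octopus.Client; using Octopus.Client.Model;)
private static async Task<Dictionary<string, string>> FetchVariablesFromOctopus(
    string projectName,
    Environments environment)
{
    var client = await OctopusAsyncClient.Create(
        new OctopusServerEndpoint("https://octopus.example.com", "API-XXXXXXXX"));

    var project = await client.Repository.Projects.FindByName(projectName);
    var variableSet = await client.Repository.VariableSets.Get(project.VariableSetId);
    var environmentId = (await client.Repository.Environments.FindByName(environment.ToString())).Id;

    // keep unscoped variables, plus those scoped to the target environment
    // (note: duplicate names with different scopes are not handled here)
    return variableSet.Variables
        .Where(v => !v.Scope.ContainsKey(ScopeField.Environment)
                 || v.Scope[ScopeField.Environment].Contains(environmentId))
        .ToDictionary(v => v.Name, v => v.Value);
}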

Time Delays

Verifying the configuration at build time when your state is fetched from a remote store is not going to solve all your problems, as this little diagram illustrates:

test pass, a user changes value, deployment happens, startup fails

You need to validate in both places: early on in your process, and on startup. How you handle the configuration being invalid doesn’t have to be the same in both places:

  • In the build/test phase, fail the build
  • On startup, raise an alarm, but start if reasonable

Again, how you handle the configuration errors when your application is starting is down to your domain, and what your application does.
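
As a minimal sketch of the “raise an alarm, but start” option, using FluentValidation’s non-throwing Validate method (the logger wiring is an assumption):

// on startup, instead of validator.ValidateAndThrow(config):
var validator = new AppConfigurationValidator();
var result = validator.Validate(config);

if (!result.IsValid)
{
    // logger is assumed to be something like Serilog's ILogger
    foreach (var error in result.Errors)
        logger.Error("Invalid configuration: {Message}", error.ErrorMessage);

    // decide here whether the errors are severe enough to stop startup
}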

configuration, c#, strongtyping, stronk, validation

---