Writing Conference Talks

15 May 2018

I saw an interesting question on Twitter today:

Hey, people who talk at things: How long does it take you to put a new talk together?

I need like 50 hours over at least a couple of months to make something I don’t hate. I’m trying to get that down (maybe by not doing pictures?) but wondering what’s normal for everyone else.

Source

I don’t know how long it takes me to write a talk - as it is usually spread over many weeks/months, worked on as and when I have inspiration. The actual process is something like this:

  1. Think it through

    This usually starts with an idea for a subject I like a lot, such as Strong Typing, Feature Toggles, or Trunk Based Development. Where I live I walk everywhere (around 15k to 20k steps per day), which gives me a lot of time to think about things.

  2. Giant markdown file of bullet points which I might want to cover

    I write down all the points that I want to talk about into one markdown file, which I add to over time. I use the github checkbox markdown format (* [ ] some point or other) so I can tick things off later.

  3. Rough order of points at the bottom

    At the bottom of this notes file, I start writing an order of things, just to get a sense of flow. Once this order gets comfortable enough, I stop updating it and start using the real slides file.

  4. Start slides writing sections as I feel like it

    I start with the title slide, and find a suitable large image for it. This takes way longer than you might imagine! For the rest of the slides, I use a mix of titles, text, and hand-drawn images.

    I use OneNote and Gimp to do the hand drawn parts, and usually the Google Cloud Platform Icons, as they’re the best looking (sorry Amazon!)

    Attribute all the images as you go. Much easier than trying to do it later.

  5. Re-order it all!

    I talk bits of the presentation through in my head, and shuffle bits around as I see fit. This happens a lot as I write the slides.

  6. Talk it through to a wall

    My wall gets talked to a lot. I talk the whole thing through out loud, make note of stumbling points and how long the talk takes, and add speaker notes if needed.

  7. Tweaks and re-ordering

    I usually end up making last-minute tweaks and order switches as I figure out how to make something flow better. I am still not happy with some transitions in my best talks!

I write all my talks using RevealJS, mostly because I can write my slides as a markdown file and have it rendered in the browser, and partly because I’ve always used it.

To get things like the Speaker Notes view working, you need to be running from a webserver (rather than just from an html file on your filesystem). For this I use NWS, which is a static webserver for your current working directory (e.g. cd /d/dev/presentations && nws).

Currently, I am trying to work out if I can use Jekyll or Hugo to generate the repository for me, as all the presentations have the same content, other than images, the slides file, and a customise.css file. I’m still not sure how best to achieve what I am after, though.

You can see the source for all my talks in my Presentations Repository on GitHub. The actual slides can be seen on my website, and I link to the videos where available.

productivity, talks, writing

---

Test Expressiveness

26 Feb 2018

We have a test suite at work which tests that a retry decorator class works as expected. One of the tests checks that when the inner implementation throws an exception, it will log the number of times it has failed:

[Test]
public async Task ShouldLogRetries()
{
    var mockClient = Substitute.For<IContractProvider>();
    var logger = Substitute.For<ILogger>();
    var sut = new RetryDecorator(mockClient, logger, maxRetries: 3);

    mockClient
        .GetContractPdf(Arg.Any<string>())
        .Throws(new ContractDownloadException());

    try
    {
        await sut.GetContractPdf("foo");
    }
    catch (Exception e){}

    logger.Received(1).Information(Arg.Any<string>(), 1);
    logger.Received(1).Information(Arg.Any<string>(), 2);
    logger.Received(1).Information(Arg.Any<string>(), 3);
}

But looking at this test, I couldn’t easily work out what the behaviour of sut.GetContractPdf("foo") was supposed to be; should it throw an exception, or should it not? The fact that there is a try...catch indicates that it might throw an exception, but doesn’t give any indication of whether that’s required or not.

try
{
    await sut.GetContractPdf("foo");
}
catch (Exception e)
{
}

Since we have the Shouldly library in use, I changed the test to be a little more descriptive:

Should.Throw<ContractDownloadException>(() =>
    sut.GetContractPdf("foo")
);

Now we know that when the decorator exceeds the number of retries, it should throw the inner implementation’s exception.

This in itself is better, but it also raises another question: Is the test name correct? Or should this now be two separate tests? One called ShouldLogRetries, and one called ShouldThrowInnerExceptionOnRetriesExceeded?

Even though I ended up adding the second test, I still left the first test with the Should.Throw(...) block, as it is still more descriptive at a glance than the try...catch.
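For completeness, a sketch of what that second test might look like, reusing the NSubstitute setup from the first test (the RetryDecorator and IContractProvider types are from the example above):

```csharp
[Test]
public void ShouldThrowInnerExceptionOnRetriesExceeded()
{
    var mockClient = Substitute.For<IContractProvider>();
    var logger = Substitute.For<ILogger>();
    var sut = new RetryDecorator(mockClient, logger, maxRetries: 3);

    mockClient
        .GetContractPdf(Arg.Any<string>())
        .Throws(new ContractDownloadException());

    // once maxRetries is exceeded, the decorator surfaces the inner exception
    Should.Throw<ContractDownloadException>(() => sut.GetContractPdf("foo"));
}
```

Each test now makes exactly one behavioural claim, matching its name.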

code, c#, testing

---

Task Chaining and the Pipeline Operator

20 Feb 2018

Since I have been trying to learn a functional language (Elixir), I have noticed how grating it is when in C# I need to call a few methods in a row, passing the results of one to the next.

The bit that really grates is that it reads backwards, i.e. the rightmost function call is invoked first, and the leftmost one last, like so:

await WriteJsonFile(await QueueParts(await ConvertToModel(await ReadBsxFile(record))));

In Elixir (or F# etc.) there is a forward pipe operator, which in C# might look something like this:

var task = record
    |> await ReadBsxFile
    |> await ConvertToModel
    |> await QueueParts
    |> await WriteJsonFile

While proposals to add a forward pipe operator to C# are being discussed, it doesn’t look like it will happen in the near future.

Something close to this is Linq, and at first, I tried to work out a way to write the pipeline for a single object using the Select statement, something like this:

await record
    .Select(ReadBsxFile)
    .Select(ConvertToModel)
    .Select(QueueParts)
    .Select(WriteJsonFile);

The problem with this is that Linq doesn’t play well with async code - you end up needing to call .Result on each task selected…which is a Bad Thing to do.
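To make that concrete, here is a sketch of where the blocking creeps in; the single-value Select extension is hypothetical (Linq doesn’t define one), and the .Result calls are exactly the Bad Thing in question:

```csharp
// a hypothetical Select over a single value, for illustration only
public static TOut Select<TIn, TOut>(this TIn value, Func<TIn, TOut> selector)
    => selector(value);

// each async step returns a Task, so the next step has to unwrap it by blocking:
var task = record
    .Select(ReadBsxFile)                          // Task<BsxFile>
    .Select(file => ConvertToModel(file.Result))  // .Result blocks the thread
    .Select(model => QueueParts(model.Result));   // and can deadlock
```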

I realised that as it’s just Tasks I really care about, I might be able to write some extension methods to accomplish something similar. I ended up with 3 extensions: one to start a chain from a value, and two to allow either Task<T> to be chained, or a Task:

public static class TaskExtensions
{
    // lift a plain value into the start of an async chain
    public static async Task<TOut> Start<TIn, TOut>(this TIn value, Func<TIn, Task<TOut>> next)
    {
        return await next(value);
    }

    // chain a Task<T> into the next async method
    public static async Task<TOut> Then<TIn, TOut>(this Task<TIn> current, Func<TIn, Task<TOut>> next)
    {
        return await next(await current);
    }

    // chain into an async method which returns a plain Task
    public static async Task Then<TIn>(this Task<TIn> current, Func<TIn, Task> next)
    {
        await next(await current);
    }
}

This can be used to take a single value, and “pipeline” it through a bunch of async methods:

var task = record
    .Start(ReadBsxFile)
    .Then(ConvertToModel)
    .Then(QueueParts)
    .Then(WriteJsonFile);

One of the nice things about this is that if I want to add another method in the middle of my chain, as long as its input and output types fit, it can just be inserted or added to the chain:

var task = record
    .Start(ReadBsxFile)
    .Then(ConvertToModel)
    .Then(InspectModelForRedundancies)
    .Then(QueueParts)
    .Then(WriteJsonFile)
    .Then(DeleteBsxFile);

You can see a real use of this in my BsxProcessor Lambda.

This is one of the great things about learning other programming languages: even if you don’t use them on a daily basis, they can really give you insight into different ways of doing things, doubly so if they are a different style of language.

code, c#

---

Tweaking Processes to Remove Errors

09 Dec 2017

When we are developing (internal) Nuget packages at work, the process used is the following:

  1. Get latest of master
  2. New branch feature-SomethingDescriptive
  3. Implement feature
  4. Push to GitHub
  5. TeamCity builds
  6. Publish package to the nuget feed
  7. Pull request
  8. Merge to master

Obviously steps 3 to 6 can repeat many times if something doesn’t work out quite right.

There are a number of problems with this process:

Pull-request after publishing

Pull requests are a great tool which we use extensively, but in this case, they are being done too late. By the time another developer has reviewed something, possibly requesting changes, the package is published.

Potentially broken packages published

As packages are test-consumed from the main package feed, there is the chance that someone else is working on another code base, and decides to update the nuget which you have just published. Now they are pulling in a potentially broken, or unreviewed package.

Published package is not necessarily what is on master

Assuming the pull-request is approved with no changes, then the code is going to make it to master. However, there is nothing to stop another developer’s changes getting to master first, and now you have a merge…and the published package doesn’t match what the source says it contains.

Feature/version conflicts with multiple developers

A few of our packages get updated fairly frequently, and there is a strong likelihood that two developers are adding things to the same package. Both publish their package off their feature branch, and now someone’s changes have been “lost”, as the latest package doesn’t have both developers’ changes.

Solution: Continuous Delivery / Master Based Development

We can solve all of these issues by changing the process to be more “Trunk Based”:

  1. Get latest of master
  2. New branch feature-SomethingDescriptive
  3. Implement feature
  4. Push to GitHub
  5. Pull request
  6. TeamCity builds branch
  7. Merge to master
  8. TeamCity builds & publishes the package

All we have really changed here is to publish from master, rather than from your feature branch. Now a pull-request has to happen (the master branch is Protected in GitHub) before you can publish a package, meaning we have eliminated all of the issues with our previous process.

Except one, kind of.

How do developers test their new version of the package is correct from a different project? There are two solutions to this (and you could implement both):

  • Publish package to a local nuget feed
  • Publish packages from feature branches as -pre versions

The local nuget feed is super simple to implement: just use a directory e.g. I have /d/dev/local-packages/ defined in my machine’s nuget.config file. We use Gulp for our builds, so modifying our gulp publish task to publish locally when no arguments are specified would be trivial.
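As a sketch, a local directory feed is a one-line entry in nuget.config (the path below is just the example directory from above, in Windows form):

```xml
<configuration>
  <packageSources>
    <!-- a plain directory can act as a nuget package source -->
    <add key="local-packages" value="D:\dev\local-packages" />
  </packageSources>
</configuration>
```

Packages pushed to that directory then show up in Visual Studio and nuget restore like any other feed.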

The publishing of pre-release packages can also be implemented through our gulp scripts: we just need to adjust TeamCity to pass the branch name to the gulp command (gulp ci --mode=Release --branch "%vcsroot.branch%"), and we can modify the script to add the -pre flag to the version number if the branch parameter is not master.

Personally, I would use local publishing only, and implement the feature branch publishing if the package in question is consumed by multiple teams, and you would want an external team to be able to verify the changes made before a proper release.

Now our developers can still test their package works from a consuming application, and not clutter the nuget feed with potentially broken packages.

design, process

---

Evolutionary Development

17 Nov 2017

Having recently finished reading the Building Evolutionary Architectures: Support Constant Change book, I got to thinking about a system which was fairly representative of an architecture which was fine for its initial version, but whose usage had outgrown the architecture.

Example System: Document Storage

The system in question was a file store for a multi-user, internal, desktop-based CRM system. The number of users was very small, and the first implementation was just a network file share. This was a fine solution to start with, but as the number of CRM users grew, cracks started to appear in the system.

A few examples of problems seen were:

  • Concurrent writes to the same files
  • Finding files for a specific record in the CRM
  • Response time
  • Files “going missing”
  • Storage size
  • Data retention rules

Most of this was caused by the number of files stored, which was well past the 5 million mark. For example, queries for “all files for record x” got slower and slower over time.

Samba shares can’t be listed in date-modified order (you actually get all the file names, then sorting is applied), which means you can’t auto delete old files, or auto index (e.g. export text to elasticsearch) updated files easily.

The key to dealing with this problem is to take small steps - if you have a large throughput to support, the last thing you want to do is break it for everyone at once, by doing a “big bang” release.

Not only can we take small steps in deploying our software, but we can also utilise Feature Toggles to make things safer. We can switch on a small part of the new system for a small percentage of users, and slowly ramp up usage while monitoring for errors.

Incremental Replacement

To replace this in an incremental manner, we are going to do the following 4 actions for every feature, until all features are done:

  1. Implement new feature in API and client
  2. Deploy client (toggle: off)
  3. Deploy API
  4. Start toggle roll out
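The “start toggle roll out” step can be as simple as stable per-user bucketing. This is a minimal sketch of the idea, not any particular toggle library; FeatureToggle, fileStoreApi and ListFilesFromShare are all made-up names:

```csharp
public class FeatureToggle
{
    public string Name { get; set; }
    public int RolloutPercent { get; set; } // 0 = off, 100 = everyone

    // stable per-user bucketing: the same user always gets the same answer,
    // so ramping from 10% to 20% only ever adds users, never removes them.
    // (a real implementation would use a hash that is stable across processes)
    public bool IsEnabledFor(string userId)
    {
        var bucket = Math.Abs((Name + userId).GetHashCode()) % 100;
        return bucket < RolloutPercent;
    }
}

// client side: call the new API when toggled on, fall back to the share otherwise
var newFileStore = new FeatureToggle { Name = "new-file-store", RolloutPercent = 10 };

var files = newFileStore.IsEnabledFor(currentUserId)
    ? await fileStoreApi.ListFiles(recordId)
    : ListFilesFromShare(recordId);
```

Ramping up is then just changing RolloutPercent while watching the error rates.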

Now that we know how each feature is going to be delivered, we can write out our list of features, in a rough implementation order:

  • Create API, build scripts, CI and deployment pipeline
  • Implement authentication on the API
  • Implement fetching a list of files for a record
  • Implement fetching a single file’s content for a record
  • Implement storing a single file for a record
  • Implement deletion of a single file for a record

The development and deployment of our features can be overlapped too: we can be deploying the next version of the client with the next feature off while we are still rolling out the previous feature(s). This all assumes that your features are nice and isolated, however!

Once this list of features is done, and all the toggles are on, from the client perspective we are feature complete.

We are free to change how the backend of the API works. As long as we don’t change the API’s contract, the client doesn’t need any more changes.

Our next set of features could be:

  • Implement audit log of API actions
  • Publish store and delete events to a queue
  • Change our indexing process to consume the store and delete events
  • Make the samba hidden (except to the API)
  • Implement background delete of old documents
  • Move storage backend (to S3, for example)

This list of features doesn’t impact the front end (client) system, but the backend systems can now have a more efficient usage of the file store. As with the client and initial API development, we would do this with a quick, iterative process.

But we can’t do iterative because…

This is a common reaction when an iterative approach is suggested, and thankfully can be countered in a number of ways.

First off, if this is an absolute requirement, we can do our iterations and feature-toggle rollouts in another environment, such as Pre-Production or QA. While this reduces some of the benefits (we lose out on the live data ramp up), it does at least keep the work in small chunks.

Another workaround is to use feature toggles anyway, but only have a couple of “trusted” users use the new functionality. Depending on what you are releasing, this could mean a couple of users you know, or giving a few users a non-visible change (i.e. they’re not aware they’ve been selected!) You could also use NDAs (Non-Disclosure Agreements) if you need to keep them quiet, although this is quite an extreme measure.

A final option is to use experiments, via an experimentation library (such as GitHub’s Scientist), which continues to use the existing feature, but in parallel runs and records the results of the replacement feature. This obviously has to be done with care, as you don’t want to cause side effects.
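The pattern itself is small enough to sketch by hand; Scientist adds sampling, timing and result publishing on top of this idea, so the code below is an illustration of the pattern rather than its API, and ListFilesFromShare, ListFilesFromApi and log are illustrative names:

```csharp
// a minimal experiment runner: the control result is always what callers get,
// the candidate runs alongside it purely to record whether they agree
public static T Experiment<T>(string name, Func<T> use, Func<T> @try, Action<string, bool> publish)
{
    var control = use();

    try
    {
        var candidate = @try();
        publish(name, EqualityComparer<T>.Default.Equals(control, candidate));
    }
    catch (Exception)
    {
        publish(name, false); // a throwing candidate is recorded as a mismatch
    }

    return control;
}

// usage: callers still get the samba share listing, while we record
// whether the new file store API would have returned the same thing
var files = Experiment(
    "list-files",
    use: () => ListFilesFromShare(recordId),
    @try: () => ListFilesFromApi(recordId),
    publish: (name, matched) => log.Information("{name} matched: {matched}", name, matched));
```

Once the mismatch rate stays at zero for long enough, the candidate can be promoted to the control.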

How do you replace old software? Big bang, iterative, experimentation, or some other process?

design, architecture, process

---