A Microservices Guideline

I originally wrote this document for my work, where we were deep into a microservices architecture. The goal was to create a set of guidelines to be used when developing a new service. This way we could ensure some consistent quality for the code, as well as have each new service slot nicely into the infrastructure we had at the time.

We're now moving away from that architecture, but I don't want this document to go to waste. So I'm posting it here. Maybe someone else will find it useful as a template.

Minimum Requirements / Microservice Guidelines

This document details some requirements we would like to enforce for all of our microservices to ensure some level of quality. Use this document as a checklist to determine if your service is production-ready. These requirements are not necessarily set in stone; they are more of a guideline.

In addition to these guidelines we strongly recommend following the 12 Factor App methodology.

Dependencies and Versioning

Dependencies must be locked to specific versions and kept maintainable. A build must be reproducible in the future. Never lock to a third party's version control branch, e.g. don't use some GitHub repo's 'dev-master' in composer.

Semantic versioning should be used to categorize our builds.

If possible use a service like Evergreen or Greenkeeper to automatically keep your dependencies up to date.

For most projects, we should use Git-flow to organize our branches and keep separate develop and master branches. It also helps us manage our releases. On GitHub, develop should be the default branch.

Environment Variables and Secrets

Passwords, tokens, URLs to backing services and other variable parameters to your application must be passed as environment variables.

Each microservice must store its environment variables in Hashicorp Vault and never ever in git.

Secrets must never be stored in a git repository. This includes SSL certificates.
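
As a quick sketch of what this looks like in practice (the variable names here are just examples), the service should read its configuration from the environment at startup and fail fast when something required is missing:

import os

# Example only: configuration comes from the environment, never from the repo.
DATABASE_URL = os.environ["DATABASE_URL"]      # required, fail fast if missing
SENTRY_DSN = os.environ["SENTRY_DSN"]          # required
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")     # optional, with a sane default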

Testing and Code Quality

There should be regression tests for the important parts of the service. This helps ensure that when we make code changes in the future, the original functionality still works as intended, or that we are at least alerted to the fact that we broke existing functionality.

There should also be a way to measure the test coverage. This helps find parts of the code which could be better tested.
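
As an example of both points, a regression test in a Python service can be as small as the snippet below (pricing.calculate_total is a hypothetical function), and coverage can then be measured with a tool such as pytest-cov (pytest --cov=yourpackage):

# test_pricing.py -- example only; pricing.calculate_total is hypothetical.
from pricing import calculate_total

def test_total_includes_vat():
    # Pins down existing behaviour so future changes can't silently break it.
    assert calculate_total(net=100, vat_rate=0.25) == 125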

Use linting tools for your programming language and framework; this helps you follow standardized code conventions and hopefully keeps your code more concise and clean.

Documentation

Dev/Contributor documentation must exist that tells the next developer how to maintain the project, e.g. clear instructions on how to install dependencies, build, run, and test the project. It should be possible to pick the project back up in a few months' time. A good place to put these instructions is the README, which is also something that every project must have.

Logging

Make sure that your docker image logs to stdout. This makes troubleshooting one step easier: we won't have to shell directly into the container to view the logs, and they will be picked up by Filebeat.

Have different levels of logging depending on development/staging/production. Use a library that can handle different levels of logging; TRACE, DEBUG, INFO, WARN and ERROR are pretty typical log levels.
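
As a sketch of what this might look like (Python's standard logging module used here as an example, with the level taken from a LOG_LEVEL environment variable):

import logging
import os

# Pick the log level from the environment so dev/staging/production can differ.
logging.basicConfig(
    level=os.getenv("LOG_LEVEL", "INFO"),
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("my-service")

logger.debug("only shown when LOG_LEVEL=DEBUG")
logger.info("service started")
logger.warning("something looks off")
logger.error("something went wrong")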

Slotting into our Infrastructure

Every microservice must have a Dockerfile. Likely it will also need a Docker repository to push builds to. For this we use AWS ECR. Typically this docker push step is done during our CircleCI build.

Each microservice must also have one or more Nomad job definition files per runtime environment, e.g. staging.json for staging and production.json for production.

For continuous integration we use CircleCI, so the project must have a CircleCI configuration.

For error tracking we use Sentry, so the project should send exceptions, errors and warnings to Sentry.

Each HTTP microservice should also have a simple healthcheck route, typically at http://service-name/health. This is then registered in Statuscake and Consul, which periodically poll the service.
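
As a sketch, such a route can be close to trivial; here is what it might look like in Flask (used purely as an example, any framework works):

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Keep this cheap; Statuscake and Consul poll it periodically.
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(port=8080)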

... but most importantly

The service must also work.

Checklist template

You can use the following markdown file to create a checklist via a gist or a GitHub issue. This can be used for review before onboarding a new service.

### Dependencies and Versioning
- [ ] Dependencies locked and sane
- [ ] Semantic Versioning used to label builds.
- [ ] Automatically updated dependencies via Greenkeeper (or similar)
- [ ] Git flow configured with `develop` as the default branch on github.

### Environment Variables
- [ ] Passwords, tokens and other configuration are passed via environment variables.
- [ ] Environment variables stored in Vault.
- [ ] No secrets stored in the git repository.

### Testing and Code Quality
- [ ] Tests exist, run and pass
- [ ] Test coverage is measured
- [ ] Linter is implemented and configured

### Documentation
- [ ] README exists
- [ ] Instructions for how to install, run, build and test the project are documented.

### Logging
- [ ] Docker image captures logs to stdout/stderr.
- [ ] A logging library with different log levels is used instead of normal prints.

### Slotting into our Infrastructure
- [ ] Has a Dockerfile and an AWS ECR docker repository to push to.
- [ ] Has Nomad job definition files.
- [ ] Has a CircleCI configuration with working builds.
- [ ] Has Sentry set up for error tracking
- [ ] Has a `/health` healthcheck route.
- [ ] Has Statuscake healthcheck set up.
- [ ] Has Consul healthcheck set up.

### Does it work?
- [ ] Yes :)

Linux Services Cheatsheet

I originally wrote this rough cheatsheet for documentation at work, where I had to deal with different Linux flavours' ways of configuring services/daemons.

These commands may be helpful when you are managing Linux servers. Different versions of Ubuntu etc. use different daemon/service managers.

Upstart

Service configurations are located in /etc/init/.

Enable a service on startup. There are two ways depending on OS.

chkconfig --add <service>

or

update-rc.d <service> defaults

You can then interact with the service using either initctl or service, again depending on OS. Sometimes both work, sometimes both work but show different services. Fun.

Initctl cheatsheet

Show services:

initctl list

Stop a service:

initctl stop <service>

Logs can sometimes be found in /var/log/upstart/<service>

Systemd

Simply put, service configurations are located in /etc/systemd/system/.
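
For reference, a minimal unit file (here a made-up my-service.service) placed in that directory looks roughly like this:

[Unit]
Description=My example service
After=network.target

[Service]
ExecStart=/usr/local/bin/my-service
Restart=on-failure

[Install]
WantedBy=multi-user.target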

Enable a service on startup:

systemctl enable <service>

Interact with a given service.

systemctl start|status|restart|stop <service>

Although service probably also works.

Logs can usually be accessed via journalctl:

journalctl -u <service>

A hacky way to disable express-jwt expiry for development

I'm developing a small app in my free time using nodejs and express, while trying out Auth0 as a way to outsource the authentication layer nicely. One problem I faced though: when testing out my API, which I now typically do via Insomnia (great tool btw), I couldn't get a long-lasting token to test with.

So I figured out a quick and dirty trick to fix this. Digging through the calls between the different node packages for JWT (express-jwt -> jsonwebtoken), I found a clockTimestamp option.

clockTimestamp: the time in seconds that should be used as the current time for all necessary comparisons.

Simply set the clockTimestamp value to some small non-zero integer. This fakes the current time used by the expiry check, so as far as the library is concerned it is right at the beginning of Unix time and no token has expired yet.

Some example code / barebones app:

const express = require("express");
const jwt = require("express-jwt");
const jwks = require("jwks-rsa");
require("dotenv").config();

const app = express();

const jwtOptions = {
  secret: jwks.expressJwtSecret({
    cache: true,
    rateLimit: true,
    jwksRequestsPerMinute: 5,
    jwksUri: "https://some-domain.auth0.com/.well-known/jwks.json"
  }),
  audience: "http://localhost:3000/api/",
  issuer: "https://some-domain.auth0.com/",
  algorithms: ["RS256"]
};

// This effectively disables the expiry check by pinning the "current time"
// to just after the Unix epoch, so any token's exp is always in the future.
if (process.env["DISABLE_JWT_EXPIRY"]) {
  console.log("WARNING: pinning clockTimestamp to 1, JWT expiry will not be checked");
  jwtOptions.clockTimestamp = 1;
}

let jwtCheck = jwt(jwtOptions);
app.use(express.json());
app.use(jwtCheck);

app.get("/userinfo", (req, res) => {
  res.json(req.user);
});

console.log("Starting on port 4000");
app.listen(4000);

Running it locally now with the DISABLE_JWT_EXPIRY environment variable set, I can skip having to get a fresh token. Another alternative would ofc be to just use a different JWT setup for local development, but eh... too lazy :)

Of course, don't use this anywhere outside local development.

A small Golang webservice Dockerimage

Here is the Dockerfile for a small golang webservice I wrote; I managed to get the resulting image quite small. I'm saving it for myself, and I figure I may as well share it since it would otherwise just be thrown away.

Note that I've replaced the more project-specific parts, like the project name and package. There is probably a smarter way to compile it with the correct path.

# The file has two steps, first the builder then the actual running image.
# This part identifies it as the builder.
FROM golang:1.10 as builder

# Create appuser to avoid running as root later
RUN adduser --system appuser

# Import the project onto where it would go on the gopath
WORKDIR /go/src/github.com/<username>/<project>
COPY . .

# Fetch dependencies
RUN go get -d -v ./...
# Compile the binary. Mind the flags because it has to work in the next image.
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -a -installsuffix cgo .

# Here we start the actual image. Scratch is super barebones.
FROM scratch

WORKDIR /

# Copy over the files we need.
# The ssl certs are needed for doing any kind of ssl connection
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# The actual binary.
COPY --from=builder /go/src/github.com/<username>/<project>/<project> /<project>
COPY --from=builder /etc/passwd /etc/passwd

# Use the non-root user.
USER appuser

# The tcp port to expose, in my case 1323 which is the default of the framework I use.
EXPOSE 1323

# Start the binary as the entrypoint.
ENTRYPOINT ["/<project>"]
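
To build it and check the resulting image size (with <project> replaced as above):

docker build -t <project> .
docker images <project>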

When building this Dockerfile for my project, I get an image that is only 13.1 MB. Pretty good compared to other services in e.g. NodeJS or Python that I have made before, which usually end up at 300 MB+.

IMAP Tools for migrating email accounts

I wrote some tools to help with migrating an existing IMAP account from one server to another, so that you can keep your emails when you switch your domain name to a new provider. Normally your email client can probably do this for you automatically when you change the server settings, but some of my users only use a webmail client, meaning they don't have a local copy to sync to the new server. Plus it's always good to have a backup.

To solve this I wrote some scripts that let me download an entire mailbox and then reupload it to a new server. You can find them here: https://github.com/Tethik/imap-tools

I initially intended to use these for migrating a domain name and its email to a new provider. Unfortunately I never got to carry out the migration, so I have only tried this out with some test accounts. The way I imagine the whole process of moving email servers along with a domain name looks something like this:

  1. Download all accounts and their emails locally using my script.
  2. Do the domain transfer.
  3. Set up the new imap accounts at the new provider.
  4. Use the upload script for each account to upload all old emails to the new accounts.

Tagging docker images differently based on git branch

At work we use git flow to organize our git repositories. master is the production branch, develop is the staging branch, and longer features get their own feature/* branch. On these branches, everything is built into docker images that are uploaded to a registry to later be deployed. git flow also gives us some basic version tagging that we want to use to track our releases.

I wanted to tag these docker images so that we could refer to them more easily in our deployments: a latest tag that would always point to the latest version, and a tag for every released version.

I came up with a python script that wraps around docker and git to automatically generate these tags for me, with some basic customization. It's pretty rough, but you can find it here: https://github.com/Tethik/lame-cli-programs/tree/master/docker-branch-tagging

Usage

docker-branch-tagging init generates a default .docker-branch-tagging file that looks something like the following.

{
    "develop": ["latest","develop-{CIRCLE_BUILD_NUM}","{git_branch}"],
    "feature/(.+)": ["{git_branch}"],
    "master": ["master","master-{CIRCLE_BUILD_NUM}","{git_latest_version_tag}"]
}

The keys are regexes, and the values are Python format strings. The values get passed the current environment variables as well as two special-case variables: git_branch and git_latest_version_tag. The script simply looks for any keys matching the current git branch and performs the templating on the values to generate the different tags.
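
In rough Python terms the tag generation boils down to something like this (a simplified sketch, not the actual script):

import json
import os
import re
import subprocess

# Simplified sketch: match the current git branch against the config keys
# and format the matching templates into docker tags.
branch = subprocess.check_output(
    ["git", "rev-parse", "--abbrev-ref", "HEAD"]).decode().strip()

with open(".docker-branch-tagging") as f:
    config = json.load(f)

# Environment variables plus the two special values described above.
# git_latest_version_tag is hardcoded here just to keep the example short.
values = dict(os.environ, git_branch=branch, git_latest_version_tag="0.2.1")

tags = []
for pattern, templates in config.items():
    if re.match(pattern, branch):
        tags += [template.format(**values) for template in templates]

print(tags)  # on master with CIRCLE_BUILD_NUM=123: ['master', 'master-123', '0.2.1']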

Doing a docker-branch-tagging build aws-blahabhla.com/example on the master branch would then result in something like the following.

docker build -t aws-blahabhla.com/example:master -t aws-blahabhla.com/example:master-123 -t aws-blahabhla.com/example:0.2.1 .

docker-branch-tagging push would then perform the docker push. Roughly like so:

docker push aws-blahabhla.com/example:master 
docker push aws-blahabhla.com/example:master-123
docker push aws-blahabhla.com/example:0.2.1

On CircleCI, which is where we do our continuous integration, the step for building and pushing the docker containers generally looks something like this now.

  build:
    docker:
      - image: circleci/python3
    steps:
      - run: sudo pip install awscli
      - run: sudo pip install "git+https://github.com/Tethik/lame-cli-programs#egg=docker_branch_tagging&subdirectory=docker-branch-tagging"
      - checkout
      - attach_workspace:
          at: .
      - setup_remote_docker
      - run: "docker login -u AWS -p $(aws ecr get-authorization-token --output text --query authorizationData[].authorizationToken | base64 --decode | cut -d: -f2) $DOCKER_REPOSITORY"
      - run: docker-branch-tagging build $DOCKER_REPOSITORY
      - run: docker-branch-tagging push $DOCKER_REPOSITORY

The attach_workspace step is used to copy over whatever dependencies may have been installed via e.g. npm, or binaries/webpack bundles that may have been built in a previous workflow step. DOCKER_REPOSITORY is an environment variable I set in the project configuration to the AWS ECR URI.

CI/CD CV

Just for fun, I decided to try making a CI/CD pipeline for my CV. This post will be a bit rough and jumbled, sorry about that; it probably would have been better to split it into several posts. Anyhow, here's how I did it.

Building the LaTeX file via Docker

Previously I just updated my CV via ShareLaTeX, so I didn't have any LaTeX packages installed on my system. Installing LaTeX is usually a confusing mess of packages, so I figured Docker might be a good fit. In addition, I knew that CircleCI takes a docker image to launch its jobs in, so I could reuse the image later.

Luckily I found that someone else had made a docker image for xelatex. With some adjustments I made my own image that contained everything I needed for compiling my tex files.

Then I uploaded it to Docker Hub for free (since it's open source).
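
To compile locally without installing LaTeX, you can then mount the repo into the container and run the build inside it, roughly like this (assuming the image has xelatex and make on its PATH):

docker run --rm -v "$PWD":/build -w /build tethik/xelatex make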

Setting up CircleCI

With the previously mentioned docker image I defined a CircleCI job step as follows.

  build_pdf:
    docker:
      - image: tethik/xelatex:latest
    steps:
      - checkout
      - run: make 
      - attach_workspace:
          at: .  
      - persist_to_workspace:
          root: .
          paths:
            - cv.pdf

The persist_to_workspace step saves the PDF so that it can be reused later in the deploy job via the attach_workspace step. The attach_workspace step in the config above, which comes before persist_to_workspace, is there to collect some scripted parts of the CV that I generate automatically. More on this later.

Deploying to Github Pages

For the CD (Continuous Delivery) part of the process I needed somewhere to publish the document. I do have my own domains and servers, but I'd rather not give CircleCI access to ssh or ftp into them. The hacky and cheap solution was to just reuse the same GitHub repository and enable the GitHub Pages feature. To do this I enabled GitHub Pages on the master branch, which I kept empty except for the final PDF output. Then I set up another step in the CircleCI config to commit and push the new PDF file to the master branch.

The CircleCI config step looks like this.

  deploy:
    docker:
      - image: circleci/node:8.9
    steps:
      - checkout
      - attach_workspace:
          at: .      
      - run: .circleci/deploy.sh

The deploy script itself looks like this.

#!/bin/sh

git config --global user.email "circleci@blacknode.se"
git config --global user.name "CircleCI Deployment"
# Move the freshly built pdf out of the repo while we switch branches.
mv cv.pdf ..
git fetch --all
git reset --hard origin/develop
git checkout master
# Bring the pdf back and publish it on the master (GitHub Pages) branch.
mv ../cv.pdf .
git add cv.pdf
git commit -m "PDF build $CIRCLE_SHA1"
git push origin master

By default CircleCI generates a key on your GitHub repo which only has read access. To get around this I created a new ssh key and added it as a deploy key to the GitHub repository. Then I removed the original key from the CircleCI project configuration and added the new ssh key that I generated.
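
Generating such a key is a one-liner (the file name is arbitrary); the public half goes into the repository's deploy keys with write access allowed, and the private half into the CircleCI project's SSH keys:

ssh-keygen -t rsa -b 4096 -C "circleci deploy key" -f circleci_deploy_key -N ""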

The final step was to add a redirect from my main homepage (this site) to the GitHub Pages link where the document is hosted. Since the server is running Apache I could use the following config.

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^(.)cv$ https://tethik.github.io/curriculum-vitae/cv.pdf
RewriteRule ^curriculum-vitae https://tethik.github.io/curriculum-vitae/cv.pdf
</IfModule>

It's a bit hacky, but it works :)

Automatically updating the document

The next fun step I wanted to take was to add some parts to the document that would be automatically generated, because writing CVs is boring and too manual. For now I just coded something simple: a script that summarizes all my pull requests made on GitHub that are in some sense open source, i.e. not to repos that I own myself or repos that were created for e.g. schoolwork.

To organize the LaTeX files I set up the repo as follows. I wanted to keep the generated files separate from the main LaTeX file that I copied over from before.

partials/ ->  generated tex files.
src/github/ -> python script that generates into partials/
cv.tex

Inside the main cv.tex file I could then refer to the scripted content using the subfiles package.

\subsection{Github Open Source Contributions}
\subfile{partials/pull_requests}

Inside the src/ folder I mean to keep the scripts that generate the partials. The src/github/ folder contains a script that generates a pull_requests.tex file into the partials/ folder. It talks to the GitHub GraphQL API and summarizes the info into a table.
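
The actual script is in the repo, but the core of it is roughly the following simplified sketch (using the requests library; the GITHUB_API_TOKEN and REPO_BLACKLIST variable names are just examples):

import os

import requests

QUERY = """
{
  viewer {
    pullRequests(first: 100, states: MERGED) {
      nodes {
        title
        url
        repository { nameWithOwner isPrivate }
      }
    }
  }
}
"""

BLACKLIST = set(os.getenv("REPO_BLACKLIST", "").split(","))

response = requests.post(
    "https://api.github.com/graphql",
    json={"query": QUERY},
    headers={"Authorization": "bearer " + os.environ["GITHUB_API_TOKEN"]},
)
response.raise_for_status()
nodes = response.json()["data"]["viewer"]["pullRequests"]["nodes"]

rows = [
    (pr["repository"]["nameWithOwner"], pr["title"])
    for pr in nodes
    if not pr["repository"]["isPrivate"]
    and pr["repository"]["nameWithOwner"] not in BLACKLIST
]

# Write a simple LaTeX tabular into the partials/ folder.
# (Real titles would also need LaTeX escaping.)
with open("partials/pull_requests.tex", "w") as f:
    f.write("\\begin{tabular}{ll}\n")
    for repo, title in rows:
        f.write(f"{repo} & {title} \\\\\n")
    f.write("\\end{tabular}\n")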

I added the following CircleCI job config. This goes before the previously defined build_pdf step.

  build_partials:
    docker:
      - image: kennethreitz/pipenv
    steps:
      - checkout      
      - run: 
          command: pipenv install 
          working_directory: ~/project/src/github/
      - run: 
          command: pipenv run make
          working_directory: ~/project/src/github/
      - persist_to_workspace:
          root: .
          paths:
            - partials/

Again I use persist_to_workspace to keep the resulting partials/*.tex files. Pipenv handles the Python dependencies beautifully. The GitHub API key and a blacklist of repos to ignore are passed through environment variables.

This is what the table looks like in the PDF.

The resulting table

In the future I'd like to add more content that's automatically generated, e.g. PyPI and npm packages published, total GitHub commit stats, etc.

Final Result

You can find the repository here, the latest PDF built by the pipeline here, and the CircleCI project here.

I still need to update my CV though.

Working locally with Docker containers

Over the weekend I wrote a small tool that will automatically update the /etc/hosts file with your running docker containers. You can find the script here: https://github.com/Spielstunde/docker-hosts-update

Example usage

First create a docker network. This will allow the containers to connect to each other and automatically resolve hostnames. This is most of the magic tbh.

docker network create evilcorp.internal

Now you can start new containers in the network:

docker run --network evilcorp.internal --rm -it nginx
docker run --network evilcorp.internal --name hello --rm -it nginx

After starting the above two containers you can then start ...

sudo docker-hosts-update

... and you should see a section like this in your /etc/hosts file.

# ! docker-hosts-update start !
# This section was automatically generated by docker-hosts-update
# Don't edit this part manually :)
172.20.0.2   hello.evilcorp.internal
172.20.0.3   friendly_golick.evilcorp.internal
# ! docker-hosts-update end   !

Because these containers are on the same bridged network, docker will first of all ensure that their hostnames resolve to each other. E.g. the first container will be able to connect to the "hello" container, either via hello.evilcorp.internal or just hello.

What my script does is enable the host machine to resolve those hostnames too. This saves you the trouble of looking up the IP manually, and it also removes the need for other approaches where you assign different ports to different containers and then use some sort of service discovery tool and proxy to manage them. On both host and container the URL http://hello.evilcorp.internal will resolve to the correct container.

Finally when you stop the containers using Ctrl+C or docker stop hello, you'll see the lines automatically removed from the hosts file, if docker-hosts-update is still running.

Custom Application Launchers in Linux

Just a small howto for creating new application launchers in Ubuntu, and I guess other similar distros. These will typically show up automatically in your start menu or launcher.

Simply create a <application>.desktop file in the ~/.local/share/applications/ or /usr/share/applications/ folder. <application> can be whatever.

At work I use the following desktop entry as a shortcut to open up an editor for our documentation. Here's an example file:

[Desktop Entry]
Encoding=UTF-8
Name=Documentation
Exec=code code/docs/
Icon=accessories-text-editor
Terminal=false
Type=Application
Categories=Development;

The following is copied for reference from https://help.ubuntu.com/community/UnityLaunchersAndDesktopFiles

Version is the version of this .desktop file.

Name is the name of the application, like 'VLC media player'.

Comment is a phrase or two describing what this program does, like 'Plays your music and videos files'.

Exec is the path to the executable file. The full path to the executable file must be used only in case it isn't in any of the paths specified in the $PATH variable. For example, any files that are inside the path /usr/bin don't need to have their full path specified in the Exec field, but only their filename.

Icon field is the icon that should be used by the launcher and represents the application. All icons that are under the directory /usr/share/pixmaps don't need to have their full path specified, but their filename without the extension. For example, if the icon file is /usr/share/pixmaps/wallch.png, then the Icon field should be just 'wallch'. All other icons should have their full path specified.

Terminal field specifies whether the application should run in a terminal window or not.

Type field specifies the type of the launcher file. The type can be Application, Link or Directory, but this article covers the 'Application' type.

Categories field specifies the category of the application. It is used by the Dash so as to categorize the applications.

Some other very good references: https://wiki.archlinux.org/index.php/Desktop_entries#Autostart https://specifications.freedesktop.org/desktop-entry-spec/latest/

Graphing the Ferryman Problem

Hello again, blog. It's been a while. I had a lot of topics that I wanted to write about the past year, but in the end I never managed to finish and publish anything. I think I'm generally ok with that though.

In the German lesson I had this week, we came across the "Ferryman problem" as part of an exercise. I found the following English explanation on the Mathswork website.

A man needs to cross a river with a wolf, a goat and a cabbage. His boat is only large enough to carry himself and one of his three possessions, so he must transport these items one at a time. However, if he leaves the wolf and the goat together unattended, then the wolf will eat the goat; similarly, if he leaves the goat and the cabbage together unattended, then the goat will eat the cabbage. How can the man get across safely with his three items?

During the lesson I then spent way too much time trying to draw a graph of all the different states. So instead I made a script using Python and the graphviz library to draw the graph for me.

The different states are represented by a tuple of three values. The leftmost value of the tuple is the set of who is on the initial (left) bank of the river. The second or middle value is the set of who is on the boat. The rightmost value is the destination bank of the river. Each character of the problem is identified by a letter: W is the wolf, S is the sheep (Schaf), K is the cabbage (Kohl) and F is the ferryman.

Some example states:

('FKSW', '', '') # Our initial starting state with everyone on the first riverbank.
('KW', 'FS', '') # The cabbage and the wolf are on the first bank. The ferryman and the sheep are on the boat in the river.
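
The actual script is a bit more elaborate, but the core idea is roughly the following simplified sketch (it leaves out the intermediate "on the boat" states and moves the ferryman directly from bank to bank):

# Simplified sketch of the idea, not the original script.
from graphviz import Digraph

def safe(bank):
    # A bank without the ferryman must not hold wolf+sheep or sheep+cabbage.
    if "F" in bank:
        return True
    return not ({"W", "S"} <= bank or {"S", "K"} <= bank)

def name(state):
    left, right = state
    return "({}, {})".format("".join(sorted(left)), "".join(sorted(right)))

def moves(state):
    left, right = state
    src, dst = (left, right) if "F" in left else (right, left)
    # The ferryman crosses alone or with exactly one passenger.
    for passenger in [None] + sorted(src - {"F"}):
        carried = {"F"} | ({passenger} if passenger else set())
        new_src, new_dst = frozenset(src - carried), frozenset(dst | carried)
        if safe(new_src):
            yield (new_src, new_dst) if "F" in left else (new_dst, new_src)

start = (frozenset("FKSW"), frozenset())
graph = Digraph("ferryman")
seen, todo = {start}, [start]
while todo:
    state = todo.pop()
    for nxt in moves(state):
        graph.edge(name(state), name(nxt))
        if nxt not in seen:
            seen.add(nxt)
            todo.append(nxt)

graph.render("ferryman", format="png")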

Anyhow, too much writing for something that is overkill. I'm pretty sure there are much simpler solutions to this. You can find the graph below.

[Graph of all the reachable states]