Friday, July 28, 2023

Does Azure DevOps have a Future?

It's not unknown for Microsoft to have duplicate products. Just think of Windows 8, Windows RT, Windows Phone, Teams and Skype, Project and Planner. The list goes on. So the fact that Microsoft has two rather similar code storage and management tools in GitHub and Azure DevOps shouldn't surprise us. But how long can this charade last?


I was curious to find out, as it looks like a project I am working on might be moving to Azure DevOps. I did some research and couldn't find much beyond speculation; however, one juicy source I did find is Episode 321 of the Azure Podcast, in which they interview Sasha Rosenbaum, a Senior PM on the GitHub team. In the podcast she says (around 10:30-12:00):

"We can't effectively run two products and have internal competition between two things, so we are going to move towards having one in the end.

"GitHub is the future, [it is] much better positioned to accomplish certain things.

"If you are in Azure DevOps now, you probably have five years (emphasis mine) that you can safely continue working in Azure DevOps.

"If you're starting out, check out GitHub first because that's where we're going to make investments mostly."

This episode is from March 2020 so that five years is now less than two.

This might be terrifying news for any of you currently working in Azure DevOps, and whilst I don't imagine the transition will be painless, I think it absolutely will, and should, happen.

But what does one PM's word count for? After all, she no longer even works at Microsoft. Well, for one, looking at her LinkedIn biography, she was actually originally on the Microsoft Azure DevOps team, so the fact that she moved onto the GitHub team may tell us something. Furthermore, the sentiment is widespread in the Azure DevOps community that this product will slowly be phased out. Try looking up "GitHub Actions to Azure DevOps pipeline" and you'll see which way the wind is blowing.

And evidence aside, the truth is that if Microsoft want to streamline their offering, which I think is safe to assume, and the choice comes down to GitHub or Azure DevOps, Microsoft would have a much harder time trying to carry people over to Azure DevOps. GitHub Enterprise is the future of Azure DevOps.

Thursday, July 27, 2023

Automating InfluxDB and Telegraf with Docker Compose

Automating is never as easy as you imagine.

InfluxDB is the most popular (as of July 2023) time-series database. It is often used alongside telegraf, an agent akin to fluent-bit. Getting these to work together with Docker Compose is fairly easy, but as soon as you want to automate the whole process, it gets painful.

Enter the world of shell arrays.

In my four years of programming, I'd never come across arrays in shell. How lucky I was. If you thought Perl syntax was weird, behold ${!arr[@]}. So why did I stumble across this nightmare? InfluxDB has a concept of buckets, a little bit like namespaces/schemas in PostgreSQL. Not quite another database, but more separated than tables. I wanted to have a number of buckets for different metrics telegraf is collecting:
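My outputs.conf looks something like this (a sketch: the org name and the host/token variables mirror the later snippet in this post; your values will differ):

```toml
[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_HOST:8086"]
  token = "$INFLUX_TOKEN"
  organization = "demo"
  namepass = ["applog"]
  bucket = "applog"

[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_HOST:8086"]
  token = "$INFLUX_TOKEN"
  organization = "demo"
  namepass = ["cpu", "disk", "mem", "system"]
  bucket = "metrics"
```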

If you're not familiar with telegraf, the above snippet is a conf file that will take measurements named applog and forward them into the applog bucket, and take cpu, disk, mem and system into metrics. In order for this to work, the buckets need to exist in InfluxDB. This is where the fun begins. Like many Docker images, the Influx image provides some useful environment variables. One useful one is DOCKER_INFLUXDB_INIT_BUCKET, which lets you specify a bucket to be created on startup. Unfortunately it only lets you specify one, but fear not: you can also mount a startup script (./scripts/influx:/docker-entrypoint-initdb.d) so you can create your buckets programmatically:

#!/bin/bash
# scripts/influx/init.sh
set -e
echo "Creating bucket: applog"
influx bucket create -n applog
echo "Creating bucket: metrics"
influx bucket create -n metrics


But that's not very DRY. So what can we use? Ah, of course a loop and array:


#!/bin/bash
# scripts/influx/init.sh
set -e

BUCKETS=(
    'applog'
    'metrics'
)
for i in "${!BUCKETS[@]}"
do
    echo "$i" Creating bucket: "${BUCKETS[$i]}"
    influx bucket create -n "${BUCKETS[$i]}"
done


So much DRYer! Now I can just add buckets to the array. No more copy-and-pasta. But wait a second. Is this truly DRY? If I add another entry to outputs.conf, I also have to remember to update init.sh.


[[outputs.influxdb_v2]]
urls = ["http://$INFLUX_HOST:8086"]
token = "$INFLUX_TOKEN"
organization = "demo"
namepass = ["measurements"]
bucket = "measurements"


This won't cut it. Thankfully, getting the bucket names from outputs.conf isn't too bad:
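Something along these lines (a sketch: the telegraf/outputs.conf path is from my repo layout, and the grep -oP pattern assumes GNU grep):

```shell
# compose_init.sh -- source this (`. compose_init.sh`), don't execute it,
# so the export below lands in your current shell.

# Pull every `bucket = "name"` value out of a telegraf conf file.
extract_buckets() {
    grep -oP 'bucket = "\K[^"]+' "$1"
}

# Resolve the conf file relative to this script, not the caller's
# working directory.
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
if [ -f "$SCRIPT_DIR/telegraf/outputs.conf" ]; then
    BUCKETS="$(extract_buckets "$SCRIPT_DIR/telegraf/outputs.conf")"
    export BUCKETS
fi
```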

We first tell the script where the config file is (relative paths in a shell script resolve against the directory the script was called from, not the script's own location). We then have to export this so that it can be accessed in our Docker container. Since we're exporting a variable, we need to source (not run) this script, i.e. . compose_init.sh.

Another fun aside here: you probably habitually add set -e to the top of your shell scripts. Don't do that here, because it will kill your terminal on any failure (since we sourced compose_init.sh).

We'll now change init.sh to use the BUCKETS environment variable, but remember BUCKETS is no longer an array (it's just a string, as that's what grep outputs), so we need to do some really obvious and intuitive stuff:
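A sketch of the reworked init.sh, assuming BUCKETS arrives as the newline-separated string compose_init.sh exported (the applog/metrics fallback is demo-only, and the influx call is guarded so the snippet also runs outside the container):

```shell
#!/bin/bash
# scripts/influx/init.sh (sketch)
set -e

# BUCKETS is a newline-separated string, so rebuild a real array.
# mapfile avoids the word-splitting pitfalls of `arr=($BUCKETS)`.
mapfile -t buckets <<< "${BUCKETS:-$'applog\nmetrics'}"  # fallback is demo-only

for bucket in "${buckets[@]}"
do
    echo "Creating bucket: $bucket"
    if command -v influx >/dev/null; then
        influx bucket create -n "$bucket"
    fi
done
```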

See ShellCheck: SC2207 if you want to understand mapfile.

Why didn't you just keep BUCKETS as an array in compose_init.sh? Simple answer: environment variables are plain strings (bash can't even export arrays), so Docker can't be handed one.

And pass BUCKETS into the Influx service in docker-compose.yml:
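The relevant fragment looks something like this (the service name, image tag and volume path are from my setup):

```yaml
services:
  influxdb:
    image: influxdb:2.7
    environment:
      - BUCKETS          # passed through from the shell that sourced compose_init.sh
    volumes:
      - ./scripts/influx:/docker-entrypoint-initdb.d
```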

And finally, in order to run this: 

. compose_init.sh && docker compose up -d

That's it. Automated bucket creation with telegraf and InfluxDB with Docker Compose.

See the full code gist here.