Thursday, July 27, 2023

Automating InfluxDB and Telegraf with Docker Compose

Automating is never as easy as you imagine.

InfluxDB is the most popular (as of July 2023) time-series database. It is often used alongside Telegraf, an agent akin to Fluent Bit. Getting the two to work together with Docker Compose is fairly easy, but as soon as you want to automate the whole process, it gets painful.

Enter the world of shell arrays.

In my four years of programming, I'd never come across arrays in shell. How lucky I was. If you thought Perl syntax was weird, behold ${!arr[@]}.

So why did I stumble across this nightmare? InfluxDB has a concept of buckets, a little bit like namespaces/schemas in PostgreSQL: not quite another database, but more separated than tables. I wanted a number of buckets for the different metrics Telegraf is collecting.
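A config of that shape might look roughly like the sketch below. It is reconstructed from the description in this post, not copied from the real file: the two output blocks follow the same pattern as the [[outputs.influxdb_v2]] entry shown further down, the namepass lists are the measurement names mentioned in the text, and the temp directory (SAMPLE_DIR, a made-up name) is only there so the snippet can be run without touching a real config.

```shell
#!/bin/bash
# Sketch only: writes a sample outputs.conf into a temp directory.
# The exact contents of the real file are an assumption.
SAMPLE_DIR="$(mktemp -d)"
SAMPLE_CONF="$SAMPLE_DIR/outputs.conf"

# The quoted 'EOF' keeps $INFLUX_HOST and $INFLUX_TOKEN literal, so
# Telegraf (not this shell) substitutes them when it loads the file.
cat > "$SAMPLE_CONF" <<'EOF'
# Application logs get their own bucket
[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_HOST:8086"]
  token = "$INFLUX_TOKEN"
  organization = "demo"
  namepass = ["applog"]
  bucket = "applog"

# Host metrics share a second bucket
[[outputs.influxdb_v2]]
  urls = ["http://$INFLUX_HOST:8086"]
  token = "$INFLUX_TOKEN"
  organization = "demo"
  namepass = ["cpu", "disk", "mem", "system"]
  bucket = "metrics"
EOF

cat "$SAMPLE_CONF"
```

Each block is a separate output plugin instance, and namepass decides which measurements that instance accepts.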

If you're not familiar with Telegraf: my outputs.conf takes measurements named applog and forwards them into the applog bucket, and sends cpu, disk, mem and system into metrics. For this to work, the buckets need to exist in InfluxDB.

This is where the fun begins. Like many Docker images, the Influx image provides some useful environment variables. One of them is DOCKER_INFLUXDB_INIT_BUCKET, which lets you specify a bucket to be created on startup. Unfortunately it only lets you specify one. Fear not, though: you can also mount a directory of startup scripts (./scripts/influx:/docker-entrypoint-initdb.d), so you can create your buckets programmatically:

#!/bin/bash
# scripts/influx/init.sh
set -e
echo Creating bucket: applog
influx bucket create -n applog
echo Creating bucket: metrics
influx bucket create -n metrics


But that's not very DRY. So what can we use? Ah, of course, a loop and an array:


#!/bin/bash
# scripts/influx/init.sh
set -e

BUCKETS=(
    'applog'
    'metrics'
)
# "${!BUCKETS[@]}" expands to the array indices (0, 1, ...)
for i in "${!BUCKETS[@]}"
do
    echo "$i" Creating bucket: "${BUCKETS[$i]}"
    influx bucket create -n "${BUCKETS[$i]}"
done
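For anyone else meeting bash arrays for the first time, this small sketch shows what the two expansions actually produce. It only echoes, so it runs fine without InfluxDB; the variable names indices and values are just for illustration. If you don't need the index, looping over "${BUCKETS[@]}" directly is a bit easier on the eyes.

```shell
#!/bin/bash
BUCKETS=(
    'applog'
    'metrics'
)

# "${!BUCKETS[@]}" expands to the array's indices, one word per index
indices=("${!BUCKETS[@]}")
# "${BUCKETS[@]}" expands to the array's elements, one word per element
values=("${BUCKETS[@]}")

echo "indices: ${indices[*]}"
echo "values:  ${values[*]}"

# Looping over the values directly, without going through the index
for bucket in "${BUCKETS[@]}"
do
    echo "Creating bucket: $bucket"
done
```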


So much DRYer! Now I can add buckets to the array. No more copy-and-pasta. But wait a second. Is this truly DRY? If I add another entry to outputs.conf, I also have to remember to update init.sh.


[[outputs.influxdb_v2]]
urls = ["http://$INFLUX_HOST:8086"]
token = "$INFLUX_TOKEN"
organization = "demo"
namepass = ["measurements"]
bucket = "measurements"


This won't cut it. Thankfully, getting the bucket names out of outputs.conf with a small script, compose_init.sh, isn't too bad.

We first tell the script where the config file is (relative paths in bash are resolved against the directory the script was called from, not the directory the script lives in). We then grep the bucket names out of it into a BUCKETS variable and export that, so it can be passed into our Docker container. Since we're exporting a variable, we need to source (not run) this script, i.e. . compose_init.sh.

Another fun aside here: you probably add set -e to the top of your shell scripts out of habit. Don't do that here. Because we sourced compose_init.sh, set -e would apply to your interactive shell, and the first command that fails would kill your terminal.
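Putting those pieces together, a compose_init.sh could look something like the sketch below. Treat it as one way to do it, not the exact script: the extract_buckets helper, the grep/sed pattern and the temporary stand-in config are all assumptions made so the example can run on its own. In a real project CONF_FILE would point at your own outputs.conf.

```shell
# compose_init.sh (sketch), meant to be sourced: . compose_init.sh
# Deliberately no `set -e`: this runs inside your interactive shell.

# Hypothetical helper: print the value of every `bucket = "..."` line
# in a Telegraf config, one name per line.
extract_buckets() {
    grep -E '^[[:space:]]*bucket[[:space:]]*=' "$1" \
        | sed -E 's/^[^"]*"([^"]*)".*$/\1/'
}

# Stand-in config so the sketch runs anywhere. Replace this with the
# path to your real outputs.conf (relative paths resolve against the
# directory you source the script from).
CONF_FILE="$(mktemp)"
cat > "$CONF_FILE" <<'EOF'
[[outputs.influxdb_v2]]
  namepass = ["applog"]
  bucket = "applog"

[[outputs.influxdb_v2]]
  namepass = ["cpu", "disk", "mem", "system"]
  bucket = "metrics"
EOF

# BUCKETS ends up as a single newline-separated string, not an array
BUCKETS="$(extract_buckets "$CONF_FILE")"
export BUCKETS

echo "$BUCKETS"
```

Assigning and exporting on separate lines keeps a failing grep from being hidden behind export's own exit status.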

We'll now change init.sh to use the BUCKETS environment variable. But remember, BUCKETS is no longer an array (it's just a string, as that's what grep outputs), so we need to do some really obvious and intuitive stuff with mapfile to turn it back into one.

See ShellCheck: SC2207 if you want to understand mapfile.
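As a rough sketch of that conversion, here is how mapfile can split a newline-separated string back into an array. The hard-coded BUCKETS value stands in for whatever the container receives from the environment, BUCKET_ARRAY is a made-up name, and the influx call is left as a comment so the snippet runs without InfluxDB. Note that mapfile needs bash 4 or newer.

```shell
#!/bin/bash
# init.sh (sketch): rebuild an array from the BUCKETS string.

# Stand-in for the value exported by compose_init.sh
BUCKETS=$'applog\nmetrics'

# Read one array element per line; -t strips the trailing newline
mapfile -t BUCKET_ARRAY <<< "$BUCKETS"

for bucket in "${BUCKET_ARRAY[@]}"
do
    # Skip blank lines (an empty BUCKETS would otherwise yield one empty entry)
    [ -n "$bucket" ] || continue
    echo "Creating bucket: $bucket"
    # In the real init.sh this is where the bucket would be created:
    # influx bucket create -n "$bucket"
done
```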

Why not just keep BUCKETS as an array in compose_init.sh? Simple answer: Docker doesn't deal with array environment variables very well. Environment variables are plain strings, and bash can't export an array in the first place.

Then pass BUCKETS into the Influx service in docker-compose.yml.
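A compose file for this might look roughly like the fragment the sketch below writes out. Almost everything in it is an assumption for illustration: the service name, the image tag, the placeholder credentials and the initial bucket name (default) are made up, and COMPOSE_DIR is a temp directory so nothing real gets overwritten. The two lines that matter are the volume mount and the bare BUCKETS entry, which asks Compose to pass the variable through from the shell that runs it.

```shell
#!/bin/bash
# Sketch only: writes a sample docker-compose.yml to a temp directory.
COMPOSE_DIR="$(mktemp -d)"
COMPOSE_FILE="$COMPOSE_DIR/docker-compose.yml"

# The quoted 'EOF' leaves ${INFLUX_TOKEN} for Docker Compose to interpolate.
cat > "$COMPOSE_FILE" <<'EOF'
services:
  influxdb:
    # Assumed image tag
    image: influxdb:2.7
    volumes:
      # Scripts in this directory run after the first-time setup
      - ./scripts/influx:/docker-entrypoint-initdb.d
    environment:
      # Placeholder setup values; change them for a real deployment
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_USERNAME=admin
      - DOCKER_INFLUXDB_INIT_PASSWORD=change-me-please
      - DOCKER_INFLUXDB_INIT_ORG=demo
      - DOCKER_INFLUXDB_INIT_BUCKET=default
      - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=${INFLUX_TOKEN}
      # No value given: passed through from the calling shell
      - BUCKETS
EOF

cat "$COMPOSE_FILE"
```

The initial bucket is given a name that isn't in outputs.conf on purpose, so init.sh doesn't try to create a bucket that already exists.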

And finally, in order to run this: 

. compose_init.sh && docker compose up -d

That's it. Automated bucket creation for Telegraf and InfluxDB with Docker Compose.

See the full code gist here.
