IoT devices produce a lot of data. This tutorial explains how to move, format, store and explore that data with state-of-the-art technologies.
The stack we are going to use is MQTT, Telegraf, InfluxDB and Grafana. On top of that, everything will be containerized with Docker. Below is a small architecture diagram that shows what we are going to build.
Note that this tutorial only explains how to set up this tech stack. It does not explain how each piece of software works in depth. All the code for this tutorial can be found on GitHub Gist.
First step, Docker and Docker Compose
The current industry trend is to containerize everything. IoT is no exception, so we will set up this complete stack with containers.
Docker is the industry standard when it comes to containers. Docker Compose is a tool to manage multiple containers. To follow this tutorial you will need both Docker and Docker Compose installed.
- Install Docker: https://docs.docker.com/engine/install/
- Install Docker Compose: https://docs.docker.com/compose/install/
For this tutorial we need several containers to exchange data, so they must be on the same Docker network. Let’s create a very simple one named iot:
docker network create iot
Getting started with MQTT
MQTT, commonly expanded as Message Queuing Telemetry Transport, is an open, ISO-standardized publish/subscribe network protocol. It is widely used in IoT for its very low network footprint and ease of use.
For this tutorial I chose Eclipse Mosquitto as the implementation of the MQTT protocol since it is open source and has an easy-to-use Docker image. Let’s pull the official Docker image located at https://hub.docker.com/_/eclipse-mosquitto:
docker pull eclipse-mosquitto
Now, let’s run a container based on this image:
docker run -it --rm \
--network=iot \
-p 1883:1883 -p 9001:9001 \
--name mosquitto \
eclipse-mosquitto
This command tells Docker to run the eclipse-mosquitto image interactively (-it) and to publish the container’s ports 1883 and 9001 on the same ports of the host (-p). Port 1883 is the standard MQTT port, and 9001 is conventionally used for MQTT over WebSockets. We also ask Docker to remove the container automatically once it has terminated (--rm). The container is started on the iot network thanks to the --network option, and --name tells Docker to name it mosquitto. Once it is running, the Mosquitto server should tell you that it is listening on port 1883.
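One caveat concerns newer images: since Mosquitto 2.0, a broker started without any listener configuration only accepts connections from localhost, so other containers on the iot network cannot reach it. If you hit that, a minimal sketch of a configuration that opens port 1883 to anonymous clients could look like the following. It is suitable for a local test only, and the file name and location (mosquitto.conf in the current directory) are assumptions.

```shell
# Write a minimal Mosquitto configuration for local testing.
# Assumption: the file is created in the current directory as mosquitto.conf.
cat > mosquitto.conf <<'EOF'
# Accept connections on port 1883 from any network interface
listener 1883
# Allow clients to connect without credentials (test setups only)
allow_anonymous true
EOF
```

To use it, add -v $PWD/mosquitto.conf:/mosquitto/config/mosquitto.conf to the docker run command above so the broker picks it up at startup.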
Let’s try to send a message to the Mosquitto server. For this we are going to use the mosquitto_pub command, which is already installed in the Docker container. Execute the following command in another terminal:
docker container exec mosquitto mosquitto_pub \
-t 'bedroom/temperature' \
-m '20'
This command asks the previously started container to run mosquitto_pub, publishing the message (-m) 20 to the topic (-t) bedroom/temperature. On the server you should see that a client connected and disconnected.
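To see the published payload itself rather than just the connection log, you can subscribe to the topic before publishing. Here is a small sketch using mosquitto_sub, which ships alongside mosquitto_pub in the image. Run it in a third terminal, then publish again:

```shell
# Subscribe to every topic under bedroom/ and print "topic payload" (-v).
# The command blocks and keeps printing messages until you press Ctrl+C.
docker container exec -it mosquitto mosquitto_sub \
-t 'bedroom/#' \
-v
```

The # wildcard matches all sub-topics, so both bedroom/temperature and any other bedroom topic will show up.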
An advantage of MQTT is that a message payload can carry any data format. For example, we could send JSON:
docker container exec mosquitto mosquitto_pub \
-t 'bedroom/sensors' \
-m '{"degrees": 20, "powerConsumption": 2}'
We have now set up our MQTT server and a way to send IoT data. Keep the Mosquitto server running for now and let’s see how to store the MQTT data passing through.
Introducing InfluxDB at the right time
InfluxDB is a time series database, which means that every data point it stores has a timestamp. InfluxDB is open source and very often used in IoT to store telemetry. It is powerful, and its main use case is handling large amounts of timestamped data. Ideal for IoT!
InfluxDB also has an official Docker image that we are going to use: https://hub.docker.com/_/influxdb. Let’s pull it:
docker pull influxdb:1.8
Following the documentation, let’s run the image:
docker run \
--network=iot \
-p 8086:8086 \
-v $PWD/influxdb-data:/var/lib/influxdb \
--name influxdb \
influxdb:1.8
This command runs the InfluxDB image we just pulled in a new container. The container runs on the same network as the Mosquitto one (iot) and has its port 8086 published (-p) on the same port of your host; InfluxDB uses this port for HTTP API access. To make the InfluxDB data persist between container runs, we bind-mount a host directory into the container (-v). $PWD is your current working directory on the host. Finally, we name this container influxdb.
The Telegraf agent
We are now going to see how to pass our data from MQTT to InfluxDB. This is a job perfectly suited for Telegraf. Telegraf is an open source, plugin-driven agent. It connects to multiple inputs and writes to multiple outputs, and it can apply transformations to its input data.
Telegraf has an official Docker image here: https://hub.docker.com/_/telegraf. Let’s first pull it:
docker pull telegraf
Telegraf can handle many different input and output plugins. For our use case we are going to use the MQTT input plugin and the InfluxDB and file output plugins. You can find the list of all Telegraf plugins here.
To start with the Telegraf configuration we have to create a telegraf.conf file. We can either start from scratch or generate a big one with the following command:
docker run --rm telegraf telegraf config > telegraf.conf
This command asks Telegraf to print its default configuration, which the shell redirects to the telegraf.conf file on the host OS. You could then edit this file to add input and output plugins. If you open it, don’t be afraid of its size: the configuration of every available plugin is included as comments.
For this tutorial, however, we will start from a blank telegraf.conf file for learning purposes. So, delete the generated file and create a new empty one. Paste the following minimal code inside:
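The configuration listing itself is not reproduced in this text. The following is a minimal sketch reconstructed from the description that follows, not necessarily the author’s exact file. The bedroom/# topic filter and the influx data format are inferred from the surrounding text. It is wrapped in a shell here-document so it can be pasted into a terminal:

```shell
# Write a minimal telegraf.conf into the current directory.
# Assumptions: topic filter "bedroom/#" and data_format "influx" are inferred
# from the tutorial text; host names are the container names used earlier.
cat > telegraf.conf <<'EOF'
# Write metrics to InfluxDB, reachable by container name on the iot network
[[outputs.influxdb]]
  urls = ["http://influxdb:8086"]

# Also write metrics to stdout and to a file, to help debugging
[[outputs.file]]
  files = ["stdout", "/tmp/metrics.out"]

# Read metrics from the Mosquitto broker
[[inputs.mqtt_consumer]]
  servers = ["tcp://mosquitto:1883"]
  topics = ["bedroom/#"]
  data_format = "influx"
EOF
```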
This file enables two outputs: InfluxDB and file. We give the InfluxDB output an array of URLs it should connect to. The file output writes everything Telegraf receives from its inputs to both stdout and a file located at /tmp/metrics.out. Writing to both locations helps with debugging.
Now for the input, mqtt_consumer. We first tell Telegraf how to connect to the Mosquitto server. Then we tell it which topics to subscribe to: in this case, every MQTT topic under bedroom.
Since the Telegraf container will run on the same Docker network as InfluxDB and Mosquitto, we can use their container names instead of their IP addresses.
Let’s test that everything is working. If you closed your Mosquitto or InfluxDB container, restart it using the commands shown earlier. Then start the Telegraf container:
docker run --rm \
--network=iot \
-v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
--name telegraf \
telegraf
This starts Telegraf, mounting the telegraf.conf file from your host system into the container as read-only (:ro). If everything is OK, the Telegraf logs should tell you that it has loaded an mqtt_consumer input and the file and influxdb outputs. They should also report that it has connected to the MQTT broker at tcp://mosquitto:1883. Now that our setup is complete, let’s send an MQTT message and check that it works. From a new terminal, send an MQTT message:
docker container exec mosquitto mosquitto_pub \
-t 'bedroom/temperature' \
-m 'bedroom_temperature celsius=20'
The message (-m) must follow the InfluxDB line protocol format, which is:
measurement_name field1=value1,field2=value2
Here measurement_name identifies your measurement, a single space separates it from the fields, and the fields are separated by commas without spaces. After running the previous Docker command you should see that Telegraf has printed a log line:
bedroom_temperature,host=0c4fc13ca77a,topic=bedroom/temperature celsius=20 1606126188330968846
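As an illustration of the payload format, here is a small shell sketch that assembles a line-protocol message from a measurement name and a list of fields. The helper name to_line_protocol and the bedroom_sensors measurement are hypothetical and used only for this example.

```shell
# Hypothetical helper: build a line-protocol payload from a measurement
# name followed by one or more field=value pairs.
to_line_protocol() {
  measurement=$1
  shift
  # Join the remaining arguments with commas (no spaces between fields)
  fields=$(IFS=,; echo "$*")
  printf '%s %s\n' "$measurement" "$fields"
}

payload=$(to_line_protocol bedroom_sensors celsius=20.5 power_consumption=2)
echo "$payload"
```

The resulting string can be passed to mosquitto_pub with -m "$payload". Numeric values written like this are stored as floats; line protocol requires an i suffix (for example 2i) for integers.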
The Telegraf log line means that Telegraf received and parsed the metric, adding host and topic tags and a timestamp. To be sure that it was also written to InfluxDB, we are going to connect to InfluxDB and look at its measurements. First, let’s get into the container and run the influx command:
docker exec -it influxdb influx
We are now in the InfluxDB shell. We first need to tell InfluxDB which database to use. In our case, a default one was created by Telegraf, let’s use it. In the InfluxDB shell write:
use telegraf
Now, let’s see all the series inside the database. In InfluxDB a series is a logical group of data defined by a shared measurement, tag set, and field key. Let’s see the ones in your database:
show series
The previous query prints all the series in your telegraf database. You should now see the series for the data you sent through MQTT. Since our data is now stored, let’s explore it in Grafana!
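Note that SHOW SERIES lists series keys, not the stored values. If you also want to look at the points themselves before moving on, one option is to run a SELECT query non-interactively. This is a sketch using the -database and -execute flags of the InfluxDB 1.x command-line client, with the measurement name from the example above:

```shell
# Run a single InfluxQL query against the telegraf database and exit.
# Prints up to five stored points of the bedroom_temperature measurement.
docker exec influxdb influx \
-database telegraf \
-execute 'SELECT * FROM "bedroom_temperature" LIMIT 5'
```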
Explore our data with Grafana
Grafana is a web application to query, visualize and create alerts on metrics, no matter where they are stored. It helps you create easy-to-use dashboards. It is used by big tech companies such as eBay, PayPal, Intel, etc. Grafana can run inside a Docker container. Let’s pull the Docker image located at https://hub.docker.com/r/grafana/grafana/
docker pull grafana/grafana
Its detailed running instructions can be found here. But we will go the simple way. Open a new terminal and run:
docker run -d \
--network=iot \
-p 3000:3000 \
-v $PWD/grafana-data:/var/lib/grafana \
--name grafana \
grafana/grafana
The -d option asks Docker to run the container in detached mode, meaning that after starting the container Docker gives you back control of the terminal while the container keeps running in the background. You can check that the container is running by listing all running containers:
docker container ls
Now that Grafana has started, it is listening on http://localhost:3000. Open it in your browser. The default credentials are admin / admin. Log in and change your password if needed.
Let’s connect Grafana to InfluxDB. Go to http://localhost:3000/datasources and add a new data source. In the list of data source providers, choose InfluxDB. In the form, set the HTTP URL field to http://influxdb:8086, the InfluxDB location on our Docker iot network. Scroll down and, under InfluxDB details, set the database field to telegraf. Click Save & test; Grafana should tell you that the data source is working. Hooray!
Now let’s create a new Grafana dashboard. Head to http://localhost:3000/dashboard/new. From here click the Add new panel button. At the bottom of the screen you will find the query editor. Edit the query:
SELECT last("celsius") FROM "bedroom_temperature" WHERE $timeFilter GROUP BY time($__interval) fill(previous)
Use whichever query you want to view your data. And that’s it: you can now query your InfluxDB database from Grafana. Querying InfluxDB is out of the scope of this article, so I recommend that you read the documentation.
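In the Grafana query above, $timeFilter and $__interval are placeholders that Grafana replaces with the dashboard’s time range and a suitable grouping interval before sending the query to InfluxDB. As a rough sketch of what the expanded query might look like, assuming a six-hour range and one-minute buckets (both values are illustrative, not what Grafana will necessarily choose):

```shell
# Illustrative stand-ins for the values Grafana substitutes at query time
time_filter='time >= now() - 6h'
interval='1m'

# Assemble the InfluxQL query with the placeholders filled in
query="SELECT last(\"celsius\") FROM \"bedroom_temperature\" WHERE ${time_filter} GROUP BY time(${interval}) fill(previous)"
echo "$query"
```

The printed query can be pasted into the InfluxDB shell, which is a convenient way to debug a panel that shows no data.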
Wrap it up with Docker Compose
Running all those containers one by one is a bit tedious. Let me introduce you to Docker Compose. Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. We will create a Compose file that starts all our services. The Docker Compose reference can be found here: https://docs.docker.com/compose/compose-file/. Below are the two files you will need to run the stack we talked about during this tutorial.
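The two file listings are not reproduced in this text. The second file is presumably the telegraf.conf described earlier. For the first one, the following is a sketch of a docker-compose.yml assembled from the docker run commands used in this tutorial; it is not necessarily the author’s exact file. It is wrapped in a shell here-document so it can be pasted into a terminal:

```shell
# Write a docker-compose.yml mirroring the docker run commands of this tutorial.
# Assumption: telegraf.conf sits in the same directory as this file.
cat > docker-compose.yml <<'EOF'
version: "3"
services:
  mosquitto:
    image: eclipse-mosquitto
    ports:
      - "1883:1883"
      - "9001:9001"
  influxdb:
    image: influxdb:1.8
    ports:
      - "8086:8086"
    volumes:
      - ./influxdb-data:/var/lib/influxdb
  telegraf:
    image: telegraf
    volumes:
      - ./telegraf.conf:/etc/telegraf/telegraf.conf:ro
    depends_on:
      - mosquitto
      - influxdb
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - ./grafana-data:/var/lib/grafana
    depends_on:
      - influxdb
EOF
```

Compose puts all services of a file on a shared default network where each one is reachable by its service name, so the mosquitto and influxdb host names used in telegraf.conf and in the Grafana data source keep working without the manually created iot network.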
To run the stack, go to the folder containing those two files and run the following command:
docker-compose up -d
This starts all the Docker Compose services in detached mode (-d). You should now be able to open Grafana at http://localhost:3000. You should also be able to send MQTT messages with:
docker container exec mosquitto mosquitto_pub \
-t 'bedroom/temperature' \
-m 'bedroom_temperature celsius=20'
Replace mosquitto with the name of your running container, which should be something like folder_name_mosquitto_1. If you want to stop all your containers:
docker-compose stop
And that’s it for this tutorial. We have seen how to set up a complete tech stack for handling a large amount of data using state-of-the-art technologies.