August 26, 2019 •
Prometheus and Grafana team up nicely to collect and visualize all kinds of monitoring data. But sometimes you also need to monitor entities that provide no proper monitoring capabilities of their own, or to which you only have limited access. In such cases it comes in handy to be able to quickly assemble a Prometheus exporter from existing building blocks. Here's how to do it.
This article demonstrates how to assemble such a Prometheus exporter from an existing tool. As an example we use a remote backup space that we want to monitor. Unfortunately, the only access given to us is via SSH/SFTP. Let's work from there.
So, we can SSH to the remote machine and use the df (disk free) tool to check the remaining disk space. All of this can be done in a single line, issued from a shell script on the local machine (we'll call it check_backup_space.sh for further reference):
#!/bin/bash
# "backup" is the SSH host (or host alias) of the remote backup space
echo "df" | sftp backup > /data/metric.txt
The command sends a df command via SFTP to the remote machine and writes the output to the file /data/metric.txt. The result may look like this:
sftp> df
Size Used Avail (root) %Capacity
104857600 2092627 102764972 102764972 1%
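For echo "df" | sftp backup to work non-interactively (which a cron job later requires), backup has to resolve to an SSH host entry with key-based authentication. A sketch of what such an entry in ~/.ssh/config might look like — the host name, user, and key path are assumptions, adapt them to your setup:

```
# ~/.ssh/config -- hypothetical entry for the backup space
Host backup
    HostName backup.example.com
    User u123456
    IdentityFile ~/.ssh/id_backup
```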
Alright, this way we can pull the information from the remote machine to the local one and write it to a text file. Now that the information is right at our fingertips, we need a Prometheus exporter to read it - and to "understand" it.
Those of you who have used Logstash might have stumbled upon Grok already. Grok is used in Logstash to parse values from log lines using regular expressions. Luckily, there is a neat little application that makes use of Grok and combines it with a Prometheus metrics server: Grok Exporter.
Grok Exporter is configured via a YAML config file, which looks like this:
global:
  config_version: 2
input:
  type: file              # defines where the input comes from
  path: /data/metric.txt  # sets the location of the data
  readall: true           # true means we want the app to re-read the complete file each time
grok:
  additional_patterns:      # the exporter ships with a ton of predefined patterns;
    - 'BYTENUMBER ([0-9])+' # here we want our own patterns, so we define them ourselves
    - 'SPACE (\s)+'
metrics:
  - type: gauge
    name: backup_space_percentage
    help: Percentage used on backup space
    match: '%{SPACE}%{BYTENUMBER}%{SPACE}%{BYTENUMBER}%{SPACE}%{BYTENUMBER}%{SPACE}%{BYTENUMBER}%{SPACE}%{BYTENUMBER:percentage}'
    value: '{{.percentage}}'
    cumulative: false
server:
  host: 0.0.0.0  # host IP of the metrics server
  port: 9144     # port of the metrics server
The most interesting part of the config is the metrics section. Here you can define one or more metrics you want to gather. Prometheus supports different metric types, such as gauge, counter, and histogram. In the case of the backup space, the percentage used may go up and down arbitrarily, so a gauge is the right fit. Thus, we set the type to gauge. To read more about the metric types, check the metric types section of the Prometheus documentation.
The help entry specifies a description of the metric. It is returned by the metrics endpoint when you access it.
The match entry is the most interesting one. Here we define what an input line has to look like for Grok Exporter to accept it and parse values from it. If an input line does not match this format, it is discarded. Notice that in the %{BYTENUMBER:percentage} part, BYTENUMBER refers to a named regular expression and percentage is the variable name the matched value is assigned to. This way you define what an input line should look like and also read specific values from it. In the value part, the value of the gauge is set. Here we refer to the variable we defined in the match part. The curly-brace notation is Go template syntax; we're just returning the parsed value. Finally, we set the metric to not be cumulative.
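To double-check the pattern outside of Grok Exporter: the match line boils down to "five whitespace-separated numbers, the fifth of which is captured as percentage". A rough awk equivalent of that extraction, purely as a sanity check, not part of the actual setup:

```shell
# Sample data line as produced by df on the backup space
line='   104857600   2092627 102764972    102764972   1%'

# Field 5 corresponds to the %{BYTENUMBER:percentage} capture;
# strip the trailing percent sign and print the bare value
percentage=$(printf '%s\n' "$line" | awk '{ sub(/%$/, "", $5); print $5 }')
echo "$percentage"   # prints 1
```

If this prints the expected number for a line copied from your metric.txt, the grok match should accept that line as well.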
For more details on the Grok Exporter config file, have a look at the example config documentation. Also, Grok comes with a ton of predefined regular expressions which you can use right out of the box; check the documentation for those, too.
If there are values (e.g. IP address, host name, type, ...) which you want to use in Prometheus to filter results, you can add labels to your metrics. Simply add them like this:

labels:
  type: '{{.type}}'
  hostname: '{{.hostname}}'

Note that this assumes the match pattern defines capture variables named type and hostname.
Finally, you need to run Grok Exporter. This is as simple as ./grok_exporter -config ./config.yml, where ./config.yml points to the configuration described above.
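If the exporter should keep running across reboots, a systemd service is one way to do it. A minimal sketch — the installation paths below are assumptions, adjust them to wherever you unpacked Grok Exporter:

```
# /etc/systemd/system/grok-exporter.service (hypothetical paths)
[Unit]
Description=Grok Exporter for backup space metrics
After=network.target

[Service]
ExecStart=/opt/grok_exporter/grok_exporter -config /opt/grok_exporter/config.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with systemctl enable --now grok-exporter.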
The only thing missing now is something that updates the metric.txt file regularly. The most common tool for this kind of work is a cron job. How to set up a cron job varies a bit across operating systems. On Ubuntu you may do something along the lines of ...
echo '0 */3 * * * /path/to/check_backup_space.sh' > /etc/cron.d/check_backup_space_cron
This creates a cron job file that instructs cron to execute the check_backup_space.sh script every three hours. check_backup_space.sh is the script from above, which writes the output of df to the metric.txt file.
Now the job needs to be registered with cron:
crontab /etc/cron.d/check_backup_space_cron
Note that crontab installs the file as the current user's crontab and expects the five-field format used above; files read by cron directly from /etc/cron.d would additionally need a user-name field.
Done. Every three hours the job will now run, overwriting the metric.txt file, which is read by Grok Exporter. If and only if a line matches the pattern specified in the Grok Exporter config, values are extracted and returned through the metrics endpoint.
To see whether the exporter is working, open your browser and navigate to http://127.0.0.1:9144/metrics. Update the host and port if you changed those or the server is running remotely.
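For illustration, the relevant part of the /metrics response should look roughly like this (the exact value depends on your df output; the format is the standard Prometheus exposition format):

```
# HELP backup_space_percentage Percentage used on backup space
# TYPE backup_space_percentage gauge
backup_space_percentage 1
```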
In Prometheus you should now find backup_space_percentage, as well as additional helpful metrics that give some insight into how well Grok Exporter is working, for example grok_exporter_lines_matching_total, grok_exporter_line_processing_errors_total, and others.
Now your metric is ready to be integrated into your Grafana dashboard.
There is one downside to this approach: df does not appear to return the exact number of bytes free or used on the backup space, but rather kilobytes (1K blocks). One way around this is to use df -h, which provides a human-readable representation of the disk space. We decided we preferred the higher resolution over just knowing that some number of gigabytes is left.