Building A Simple Prometheus Exporter

August 26, 2019

Prometheus and Grafana team up nicely to collect and visualize various monitoring data. But at times you also need to monitor entities that do not provide proper monitoring capabilities themselves, or to which you have only limited access. Quickly assembling a Prometheus exporter from existing building blocks comes in handy in such cases. Here's how to do this.

This article demonstrates how to assemble such a Prometheus exporter from an existing tool. As an example, we use a remote backup space that we want to monitor. Unfortunately, the only access given to us is via SSH / SFTP. Let's work from there.


So, we can connect to the remote machine and use the disk free tool, df, to check the remaining disk space. All of this can be combined into a single line. This command can be issued from a shell script on the local machine:

echo "df" | sftp backup > /data/metric.txt

The command will send a df command via SFTP to the remote machine and write the output to a file /data/metric.txt. The result may look like this:

sftp> df
        Size         Used        Avail       (root)    %Capacity
   104857600      2092627    102764972    102764972           1%

Alright, this way we can pull the information from the remote machine to the local one and write it to a text file. Now that the information is at our fingertips, we need a Prometheus exporter to read it and "understand" it.
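Before wiring up an exporter, it can help to confirm that the value we care about is easy to pick out of that output. The following is a minimal sketch, not part of the original setup: the function name percent_used is made up for illustration, and it assumes the data row is the only line whose last field is a number followed by a percent sign.

```shell
#!/bin/sh
# Print the %Capacity value (without the % sign) from sftp df output on stdin.
# percent_used is a hypothetical helper name.
percent_used() {
    # Only the data row ends in "<digits>%"; the prompt and header lines do not.
    awk '$NF ~ /^[0-9]+%$/ { sub(/%$/, "", $NF); print $NF }'
}

# Sample output copied from the run shown above.
sample='sftp> df
        Size         Used        Avail       (root)    %Capacity
   104857600      2092627    102764972    102764972           1%'

printf '%s\n' "$sample" | percent_used   # prints: 1
```

If this prints nothing for your server's output, the column layout differs and any pattern written against it later will need adjusting as well.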

Grok Exporter

Those of you who have used Logstash might have stumbled upon Grok already. Grok is used in Logstash to parse values from log lines using regular expressions. There is also a neat little application that makes use of Grok and combines it with a Prometheus metrics server: Grok Exporter.

Grok Exporter is configured using a YAML config file, which looks like this:

    global:
      config_version: 2
    input:
      type: file                # defines where the input comes from
      path: /data/metric.txt    # sets the location of the data
      readall: true             # true means we want the app to read the complete file each time
    grok:
      additional_patterns:      # the exporter ships with many predefined patterns; we also want
        - 'BYTENUMBER ([0-9])+' # to use our own patterns, so we define them here
        - 'SPACE (\s)+'
    metrics:
      - type: gauge
        name: backup_space_percentage
        help: Percentage used on backup space
        value: '{{.percentage}}'
        cumulative: false
    server:
      host:                     # Host IP of the metrics server
      port: 9144                # Port of the metrics server

The most interesting part of the config is the metrics section. Here you can define one or more metrics you want to gather.

Prometheus supports different metric types, like gauge, counter and histogram. In the case of the backup space, the percentage used may go up and down arbitrarily, so we set the type to gauge. To read more about the metric types, check the metric types section of the Prometheus documentation.

The help entry specifies a description of the metric. It will be returned by the metrics endpoint if you access it.

The match entry is the most interesting one. Here, we define what an input line has to look like for Grok Exporter to accept it and parse values from it. If an input line does not match this format, it is discarded. Notice that in the %{BYTENUMBER:size} part, BYTENUMBER refers to a named regular expression and size is the variable name the value is assigned to. This way you define what an input line should look like and also read specific values from it. In the value entry, the value of the gauge is set. Here, we refer to a variable defined in the match entry. The notation with the curly braces is a Go template; we're just returning the parsed value.
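To get a feeling for what such a match has to accept, the two custom patterns can be expanded by hand and tried against the sample line with sed. This is only a sketch under assumptions: the four-numbers-then-percentage layout is inferred from the df output shown earlier rather than taken from the actual match line, and match_percentage is a hypothetical helper name. POSIX character classes stand in for \s to keep the expression portable.

```shell
#!/bin/sh
# Regex equivalents of the custom Grok patterns BYTENUMBER and SPACE.
BYTENUMBER='[0-9]+'
SPACE='[[:space:]]+'

# Assumed line layout: four numeric columns followed by "<digits>%".
PATTERN="^${SPACE}(${BYTENUMBER})${SPACE}(${BYTENUMBER})${SPACE}(${BYTENUMBER})${SPACE}(${BYTENUMBER})${SPACE}(${BYTENUMBER})%\$"

# Print the fifth capture (the percentage) for matching lines; drop all others,
# much like Grok Exporter discards lines that do not match.
match_percentage() {
    sed -E -n "s/${PATTERN}/\\5/p"
}

printf '%s\n' 'sftp> df' \
    '   104857600      2092627    102764972    102764972           1%' |
    match_percentage   # prints: 1
```

The prompt line is dropped silently because it does not fit the pattern, which is the same behaviour the exporter shows for non-matching input.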

Finally, we set the metric to not be cumulative.

For more details on the Grok Exporter config file, have a look at the example config documentation. Grok also has a ton of predefined regular expressions that you can use right out of the box; check the documentation for those, too.

If you have values (e.g. IP address, host name, type, ...) which you want to use in Prometheus to filter results, you can add labels to your metrics. Simply add them to the metric definition like this:

      labels:
        type: '{{.type}}'
        hostname: '{{.hostname}}'

Finally, you need to run Grok Exporter. This is as simple as ./grok_exporter -config ./config.yml, where ./config.yml points to the configuration described above.

Updating the Data

The only thing missing now is something that updates the metric.txt file regularly. The most common tool for this kind of work is a cron job. How to set up a cron job varies a bit across operating systems. On Ubuntu you may do something along the lines of:

echo '0  */3  *  *  *    /path/to/' > /etc/cron.d/check_backup_space_cron

This creates a cron job file that instructs the cron system to execute the script every three hours. The script is the one from above, which writes the output of df to the metric.txt file.

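Now the job needs to be registered with cron. Note that crontab, when given a file, replaces the current user's existing crontab with the contents of that file: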

crontab /etc/cron.d/check_backup_space_cron

Done. Now the job runs every three hours, overwriting the metric.txt file, which is read by Grok Exporter. If and only if a line matches the pattern specified in the Grok Exporter config, values are extracted and returned through the metrics endpoint.

Accessing the Metric

To see whether the exporter is working, open your browser and navigate to the exporter's metrics endpoint. Adjust the host and port in the URL if you changed those or the server is running remotely.

In Prometheus you should now find backup_space_percentage as well as additional helpful metrics that give some insight into how well Grok Exporter is working, for example grok_exporter_lines_matching_total, grok_exporter_line_processing_errors_total and others.
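The same check can be done from a shell instead of a browser. The sketch below assumes the endpoint returns the standard Prometheus text exposition format; metric_value is a hypothetical helper, the sample response is illustrative rather than captured from a real run, and the helper only handles metrics without labels.

```shell
#!/bin/sh
# Print the value of an unlabelled metric from Prometheus text format on stdin.
# metric_value is a hypothetical helper name.
metric_value() {
    # Comment lines start with "#", so their first field never equals the name.
    awk -v name="$1" '$1 == name { print $2 }'
}

# Illustrative response, not captured from a real exporter.
sample='# HELP backup_space_percentage Percentage used on backup space
# TYPE backup_space_percentage gauge
backup_space_percentage 1'

printf '%s\n' "$sample" | metric_value backup_space_percentage   # prints: 1
```

Against a running exporter, the input would come from fetching the metrics URL (for instance with curl -s) and piping the response into metric_value, using the host and port from your own config.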

Now your metric is ready to be integrated into your Grafana dashboard.


There is one downside to this approach. It looks like df does not return the exact number of bytes free or used on the backup space, but rather kilobytes. An alternative is df -h, which prints a human-readable variation of the disk space. We decided against it because we wanted a higher resolution than just knowing that some number of gigabytes is left.
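If byte values are wanted anyway, the kilobyte figures could be scaled after parsing. A minimal sketch, assuming one unit in the df output equals 1024 bytes (a factor worth verifying against your server) and a shell with 64-bit arithmetic; kib_to_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Convert a kilobyte count to bytes, assuming 1 kilobyte = 1024 bytes.
kib_to_bytes() {
    echo "$(( $1 * 1024 ))"
}

# The "Avail" value from the sample output above.
kib_to_bytes 102764972   # prints: 105231331328
```

This only changes the unit, not the precision: the result is still accurate to the nearest kilobyte.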