Monitoring StackStorm with Prometheus and Grafana

StackStorm is an open source automation platform that allows you to integrate and automate workflows across your infrastructure. Monitoring StackStorm and its dependencies, such as MongoDB, RabbitMQ, and system metrics, is crucial for ensuring its availability, performance, and reliability. In this blog post, we will explore how to monitor StackStorm using Prometheus and Grafana, along with exporters for various components.

Blackbox Exporter is a tool provided by Prometheus that allows you to probe endpoints over various protocols, such as ICMP and HTTP. It helps monitor connectivity and network/DNS issues. We can set up Blackbox Exporter to probe StackStorm instances, MongoDB, RabbitMQ, and other services. To set up Blackbox Exporter for your instances, follow the guide on the Blackbox Exporter GitHub page.
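
As a rough sketch of the exporter-side configuration, the fragment below defines one HTTP module and one ICMP module in the Blackbox Exporter's own config file, modeled on the example configuration that ships with the exporter. The 5s timeouts are illustrative assumptions, not recommendations.

```yaml
# blackbox.yml: module definitions read by the Blackbox Exporter itself
modules:
  http_2xx:
    prober: http
    timeout: 5s        # assumed value; tune for your network
    http:
      method: GET      # with no valid_status_codes set, any 2xx response counts as success
  icmp:
    prober: icmp
    timeout: 5s        # assumed value
```

Note that the ICMP prober generally needs elevated privileges to send pings (for example the CAP_NET_RAW capability on Linux), so an icmp probe that always fails is often a permissions problem rather than a network one.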

This setup uses two probe modules, http_2xx and icmp, both adapted from the examples provided by Prometheus. Additional probe types can be found in the example file in the Blackbox Exporter's GitHub repository. Once the Blackbox Exporter is set up, you can use the example below to configure Prometheus targets for ICMP and HTTP endpoints in the prometheus.yaml file:

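```yaml
# Add these jobs under scrape_configs: in prometheus.yaml

# icmp module with ping connectivity test
- job_name: blackbox-ping
  honor_timestamps: true
  params:
    module:
      - icmp
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  static_configs:
    - targets:
        - stackstorm.example.com   # placeholder: replace with the hosts to ping
  relabel_configs:
    # Pass the target to the exporter as the ?target= URL parameter
    - source_labels: [__address__]
      separator: ;
      regex: (.*)
      target_label: __param_target
      replacement: $1
      action: replace
    # Keep the probed target as the instance label so that
    # series from different targets stay distinguishable
    - source_labels: [__param_target]
      target_label: instance
    # Send the scrape itself to the Blackbox Exporter
    # (replace 0.0.0.0 with the exporter's host if it is not local)
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 0.0.0.0:9115
      action: replace

# http_2xx module with http connectivity test
- job_name: blackbox-http
  honor_timestamps: true
  params:
    module:
      - http_2xx
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /probe
  scheme: http
  static_configs:
    - targets:
        - https://stackstorm.example.com   # placeholder: replace with the URLs to probe
  relabel_configs:
    # Pass the target to the exporter as the ?target= URL parameter
    - source_labels: [__address__]
      separator: ;
      regex: (.*)
      target_label: __param_target
      replacement: $1
      action: replace
    # Keep the probed target as the instance label
    - source_labels: [__param_target]
      target_label: instance
    # Send the scrape itself to the Blackbox Exporter
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 0.0.0.0:9115
      action: replace
```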

Monitoring MongoDB using MongoDB Exporter is essential for gaining insights into the performance and health of your MongoDB database. MongoDB Exporter is a Prometheus exporter that collects and exposes MongoDB metrics in a format compatible with Prometheus. To set up MongoDB Exporter, check out the MongoDB Exporter repository on GitHub. You can install MongoDB Exporter and configure Prometheus to scrape metrics from it.
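
Once the exporter is running, the Prometheus side might look like the sketch below. The hostname is a placeholder, and port 9216 is assumed because it is the default listen port of the widely used Percona MongoDB Exporter; adjust both if your deployment differs.

```yaml
# Fragment for scrape_configs: in prometheus.yaml
- job_name: mongodb
  scrape_interval: 15s
  static_configs:
    - targets:
        - mongodb.example.com:9216   # placeholder host; 9216 is the assumed exporter port
```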

Node Exporter is a Prometheus exporter for system metrics such as CPU usage, memory usage, disk I/O, and network statistics. It provides insights into the health and performance of the underlying host where StackStorm is running. To set up Node Exporter, follow the Node Exporter GitHub page. Once Node Exporter is set up, you can configure Prometheus to collect system metrics from each StackStorm server.
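
A minimal scrape job for Node Exporter could look like the following. The hostnames and the role label are hypothetical; 9100 is Node Exporter's default port.

```yaml
# Fragment for scrape_configs: in prometheus.yaml
- job_name: node
  scrape_interval: 15s
  static_configs:
    - targets:
        - st2-node1.example.com:9100   # placeholder StackStorm hosts
        - st2-node2.example.com:9100
      labels:
        role: stackstorm               # hypothetical label, handy for filtering in Grafana
```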

StatsD Exporter allows you to collect and export custom metrics from StackStorm. It maps StackStorm metrics to Prometheus-compatible metrics using a mapping file. To set up StatsD Exporter, check out the StatsD Exporter repository on GitHub. You can configure StackStorm to send metrics to StatsD, and then StatsD Exporter will translate those metrics into Prometheus metrics.
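
To illustrate what a mapping file does, here is a small sketch. The StatsD metric name pattern is hypothetical and only stands in for whatever names your StackStorm instance actually emits; the file name is also an assumption. The file is passed to the exporter with its `--statsd.mapping-config` flag.

```yaml
# statsd-mapping.yml (assumed file name)
mappings:
  # Hypothetical incoming name: st2.action.executions.<status>
  # The * glob is captured as $1 and turned into a Prometheus label.
  - match: "st2.action.executions.*"
    name: "st2_action_executions"
    labels:
      status: "$1"
```

With a mapping like this, one dotted StatsD name per status becomes a single Prometheus metric with a `status` label, which is usually easier to query and graph.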

Monitoring the Services

Core StackStorm services such as auth, stream, and api can be monitored over HTTP using Prometheus and Grafana. Other services that support StackStorm, such as st2scheduler, can be monitored by exposing their status through a process monitoring tool. The endpoints of the core services are reachable from a web browser, although requests without valid credentials may be rejected. Even so, the HTTP response each endpoint returns tells you whether the service behind it is answering, so checking these responses gives you a picture of the state of each service.

To collect the data, you can use blackbox-exporter to send HTTP requests to the endpoints of the StackStorm services. By configuring Prometheus to scrape metrics from blackbox-exporter, you can visualize the states of the services in Grafana.
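
Because unauthenticated requests may be rejected, a probe that only accepts 2xx responses could report a healthy service as down. One possible workaround, sketched below, is a dedicated Blackbox Exporter module that also accepts 401. The module name `http_st2`, the timeout, and the TLS setting are assumptions for illustration.

```yaml
# blackbox.yml fragment: hypothetical module for StackStorm service endpoints
modules:
  http_st2:                          # module name is an assumption
    prober: http
    timeout: 5s                      # assumed value
    http:
      method: GET
      # Treat "unauthorized" as evidence that the service is answering
      valid_status_codes: [200, 401]
      tls_config:
        insecure_skip_verify: true   # only if the endpoints use self-signed certificates
```

A Prometheus job would then pass `http_st2` as its module parameter and list the service URLs as targets, in the same way as the blackbox-http job. In Grafana, the exporter's `probe_success` metric (1 for success, 0 for failure) is a natural basis for an up/down panel per service.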

In conclusion, monitoring StackStorm and its dependencies is crucial for ensuring the reliability and performance of your automation workflows. By leveraging Prometheus and Grafana along with exporters for different components, you can gain valuable insights into the health and behavior of your StackStorm environment. With proactive monitoring and alerting, you can quickly identify and address any issues before they impact your operations.
