How to Monitor API Health with Blackbox Exporter and Prometheus

Ever had an API go down silently, only to realize it after users started complaining? I’ve been there—more times than I’d like to admit. That’s why I now rely on Prometheus and Blackbox Exporter to proactively monitor API health. In this guide, I’ll walk you through setting up Blackbox Exporter to probe endpoints, track latency, and alert you the moment something goes sideways.

Why Blackbox Exporter?

Blackbox Exporter is like having a dedicated API watchdog. It sends HTTP, TCP, ICMP, or DNS probes to your endpoints and reports back metrics like:

  • Uptime/downtime
  • Response latency
  • SSL certificate expiry

Pair it with Prometheus for scraping and Grafana for visualization, and you’ve got a robust monitoring system.


What You’ll Need

  1. Prometheus (already installed and running)
  2. Blackbox Exporter (prebuilt binaries are published on the project's GitHub releases page)
  3. A target API endpoint to monitor (e.g., https://api.example.com/health)
  4. Basic familiarity with YAML configs (don’t worry, I’ll guide you).

Step 1: Install and Configure Blackbox Exporter

Installation

Run the following to download and extract Blackbox Exporter:

wget https://github.com/prometheus/blackbox_exporter/releases/download/v0.24.0/blackbox_exporter-0.24.0.linux-amd64.tar.gz
tar -xvf blackbox_exporter-0.24.0.linux-amd64.tar.gz
cd blackbox_exporter-0.24.0.linux-amd64

Configuration

Edit blackbox.yml to define your probes. Here’s a minimal HTTP check:

modules:
  http_2xx:
    prober: http
    http:
      preferred_ip_protocol: "ip4"  # documented values are "ip4" and "ip6"
      valid_status_codes: [200]
      follow_redirects: true  # current option name; older releases used no_follow_redirects

Start the exporter:

./blackbox_exporter --config.file=blackbox.yml

Verify it’s running by visiting http://localhost:9115 (the default port).
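If a bare 200 isn't enough evidence that the API is healthy, a stricter module can also check TLS and the response body. The following is only a sketch: the module name http_health_strict is hypothetical, and the regular expression assumes your health endpoint returns JSON containing "status": "ok", which may not match your API. It would sit under the existing modules key in blackbox.yml.

```yaml
modules:
  http_health_strict:              # hypothetical module name
    prober: http
    timeout: 5s                    # fail the probe if it takes longer than this
    http:
      method: GET
      preferred_ip_protocol: "ip4"
      valid_status_codes: [200]
      fail_if_not_ssl: true        # probe fails if the connection is not TLS
      fail_if_body_not_matches_regexp:
        - '"status"\s*:\s*"ok"'    # assumed response shape, adjust to your API
```

To use it, reference http_health_strict instead of http_2xx in the module parameter of the Prometheus scrape job.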


Step 2: Configure Prometheus to Scrape Blackbox

Add a job to prometheus.yml to scrape Blackbox’s metrics and probe your API:

scrape_configs:
  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]  # Use the module defined earlier
    static_configs:
      - targets:
        - https://api.example.com/health  # Your API endpoint
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9115  # Blackbox Exporter address

Restart Prometheus:

systemctl restart prometheus
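The job above only collects probe results. If you also want the exporter's own process metrics (served on its /metrics path), a second, plain scrape job is one way to get them. This is a minimal sketch: the job name blackbox_exporter is my own choice, and localhost:9115 assumes the exporter runs on the same host as Prometheus.

```yaml
scrape_configs:
  - job_name: 'blackbox_exporter'   # hypothetical job name
    # metrics_path defaults to /metrics, so no override is needed here
    static_configs:
      - targets:
          - localhost:9115          # assumed exporter address
```

Add this entry alongside the existing blackbox job rather than as a second scrape_configs key.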

Step 3: Set Up Alerts for Downtime or High Latency

Now, let’s alert on two critical scenarios:

  1. API is down (the probe fails: non-200 status, timeout, or connection error).
  2. Latency exceeds 500ms.

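Add these rules to a Prometheus rules file (alert.rules.yml) and list that file under rule_files in prometheus.yml. Prometheus evaluates the rules; Alertmanager only handles routing the resulting notifications: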

groups:
- name: api_health
  rules:
  - alert: APIUnavailable
    expr: probe_success{job="blackbox"} == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "API is down (instance: {{ $labels.instance }})"
  
  - alert: HighLatency
    expr: probe_duration_seconds{job="blackbox"} > 0.5
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "High API latency ({{ $value }}s on {{ $labels.instance }})"
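For Prometheus to load these rules and hand firing alerts to Alertmanager, prometheus.yml needs to reference both. A minimal sketch, assuming alert.rules.yml sits next to prometheus.yml and Alertmanager listens on its default port 9093 on the same host (both are assumptions about your setup):

```yaml
rule_files:
  - alert.rules.yml            # path is relative to prometheus.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093   # assumed Alertmanager address
```

After a restart or reload, the rules should show up on the Alerts page of the Prometheus UI.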

Step 4: Visualize Metrics in Grafana

Create a dashboard to track:

  • Uptime (probe_success)
  • Latency (probe_duration_seconds)
  • SSL expiry (probe_ssl_earliest_cert_expiry, a Unix timestamp; subtract time() to get the seconds remaining)

Here’s a sample Grafana query for latency:

avg(probe_duration_seconds{instance=~"$instance"}) by (instance)

Pro Tip: Use Grafana variables (like $instance) to make your dashboard dynamic.
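Certificate expiry is also worth an alert, not just a panel. Here is a sketch of a rule that could go in the same rules file as the other alerts. The group name, alert name, and 14-day threshold are my own choices, not anything Blackbox Exporter prescribes.

```yaml
groups:
  - name: api_certificates          # hypothetical group name
    rules:
      - alert: SSLCertExpiringSoon  # hypothetical alert name
        # the metric is a Unix timestamp, so subtracting time() gives seconds left
        expr: probe_ssl_earliest_cert_expiry{job="blackbox"} - time() < 14 * 24 * 3600
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "TLS certificate for {{ $labels.instance }} expires in {{ $value | humanizeDuration }}"
```

The comparison keeps the left-hand value, so $value in the annotation is the remaining lifetime in seconds.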

Troubleshooting

Problem: Blackbox Exporter crashes on startup.
Fix: Check the YAML syntax, because indentation matters! Use yamllint to validate, or run the exporter with --config.check to have it parse blackbox.yml and exit.

Problem: Prometheus isn’t scraping metrics.
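Fix: Check the Targets page in the Prometheus UI for scrape errors, verify the targets and relabel_configs in prometheus.yml, and query the exporter directly at http://localhost:9115/probe?target=https://api.example.com/health&module=http_2xx to see the raw probe result.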

Problem: Alerts aren’t firing.
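Fix: First check the Alerts page in the Prometheus UI to confirm the rules are loaded and whether they are pending or firing. If they fire but no notification arrives, ensure Prometheus points at Alertmanager and that Alertmanager is correctly configured to route alerts (e.g., to Slack or email).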
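For reference, a bare-bones Alertmanager routing config for Slack might look like the sketch below. The receiver name, channel, and webhook URL are placeholders I made up; substitute your own incoming-webhook URL.

```yaml
route:
  receiver: slack-notifications      # hypothetical receiver name
  group_by: [alertname, instance]

receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/YOURS  # placeholder webhook
        channel: '#alerts'           # assumed channel name
        send_resolved: true          # also notify when the alert clears
```

This goes in Alertmanager's own config file (commonly alertmanager.yml), not in prometheus.yml.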


FAQ

Q: Can Blackbox monitor non-HTTP endpoints?
A: Yes! It supports TCP, ICMP, and DNS probes—just tweak the modules in blackbox.yml.
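As a rough illustration, non-HTTP modules in blackbox.yml could look like this. The module names and the DNS query name are examples of my own; adjust them to whatever you are probing.

```yaml
modules:
  tcp_connect:                 # hypothetical name: succeeds if a TCP connection opens
    prober: tcp
    timeout: 5s
  icmp_ping:                   # hypothetical name: ICMP echo (may need extra privileges)
    prober: icmp
    icmp:
      preferred_ip_protocol: "ip4"
  dns_lookup:                  # hypothetical name: the probe target is the DNS server to query
    prober: dns
    dns:
      query_name: "example.com"   # assumed record to look up
      query_type: "A"
```

Targets for the TCP module take the form host:port, and each module needs a scrape job (or a module parameter) that selects it.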

Q: How do I monitor multiple APIs?
A: Add more targets under static_configs in Prometheus, or use service discovery.
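With static targets, that is just a longer list in the existing blackbox job. In this sketch, the second and third URLs are made-up examples; the relabel_configs from Step 2 stay unchanged and give each target its own instance label.

```yaml
    static_configs:
      - targets:
          - https://api.example.com/health
          - https://billing.example.com/health   # hypothetical second API
          - https://auth.example.com/health      # hypothetical third API
```

Because the alert rules select on job="blackbox" rather than a specific instance, they apply to every target in the list automatically.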

Q: What’s the overhead of running Blackbox?
A: Minimal. A single instance can handle hundreds of probes.


Next Steps

  • Monitor internal services (e.g., databases, MQTT brokers).
  • Set up synthetic checks for user journeys.
  • Integrate with Grafana alerts for richer notifications.

For more on observability, check out my posts on Grafana alerting and Zigbee2MQTT monitoring.

Now go forth and never be blindsided by API downtime again! 🚀