Prometheus Instrumentation

13 Sept. 2018

What Is Prometheus?

Metrics-Based Monitoring

  • Unified System for Metrics and Monitoring
  • Pull-Based (Scrapes Targets)
  • Integrates Seamlessly with Grafana for Graphs
  • Includes a Powerful Expression Language
  • Supports Multidimensional Metrics
  • Alerts Based on Metrics
  • Instrumentation Libraries for White Box Monitoring

Metrics Scraping

  • Metrics Are Scraped Over HTTP
  • Uses Service Discovery to Find Targets
  • Simple Text-Based Format
chaumes@prometheus-901:~$ curl -sS localhost:9100/metrics | grep node_filesystem_avail

# HELP node_filesystem_avail Filesystem space available to non-root users in bytes.
# TYPE node_filesystem_avail gauge
node_filesystem_avail{device="/dev/sda1",fstype="ext4",mountpoint="/"} 2.3620919296e+10
node_filesystem_avail{device="none",fstype="tmpfs",mountpoint="/run/lock"} 5.24288e+06
node_filesystem_avail{device="none",fstype="tmpfs",mountpoint="/run/shm"} 5.20429568e+08
node_filesystem_avail{device="none",fstype="tmpfs",mountpoint="/run/user"} 1.048576e+08
node_filesystem_avail{device="rpc_pipefs",fstype="rpc_pipefs",mountpoint="/run/rpc_pipefs"} 0
node_filesystem_avail{device="srv_salt",fstype="vboxsf",mountpoint="/srv/salt"} 4.1484754944e+11
node_filesystem_avail{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 1.03616512e+08
node_filesystem_avail{device="vagrant",fstype="vboxsf",mountpoint="/vagrant"} 4.1484754944e+11
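
The text format shown above is simple enough to read with a few lines of code. The following is a minimal sketch in plain Python, not a full parser: it assumes label values contain no escaped quotes and that lines carry no trailing timestamp. The function name `parse_sample` is made up for illustration.

```python
import re

# One sample line: metric name, optional {label="value",...} block, then the value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse_sample(line):
    """Parse one non-comment line of the text format into (name, labels, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = {m.group("key"): m.group("val")
              for m in LABEL_RE.finditer(match.group("labels") or "")}
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'node_filesystem_avail{device="/dev/sda1",fstype="ext4",mountpoint="/"} 2.3620919296e+10'
)
print(name, labels["mountpoint"], value)
```

Lines starting with `#` (HELP and TYPE) are comments and would need to be skipped before calling the function.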

Instrumentation Libraries

Metric Types

Counter

  • Represents a Cumulative Numerical Value
  • Monotonically Increases
    • i.e. the value only goes up; it returns to zero only when the process restarts
  • Useful for
    • number of requests served
    • tasks completed
    • number of errors
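
To make the counter semantics concrete, here is a toy counter in plain Python. It is a sketch of the behaviour described above, not the API of any instrumentation library, and the metric name is hypothetical.

```python
class Counter:
    """A toy counter: the value starts at zero and can only be incremented."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0

    def inc(self, amount=1.0):
        # Rejecting negative increments is what keeps the series monotonic.
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


requests_served = Counter("requests_served_total")  # hypothetical metric name
requests_served.inc()
requests_served.inc(4)
print(requests_served.name, requests_served.value)
```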

Gauge

  • Represents a Single Numerical Value
  • Can Increase or Decrease Arbitrarily
  • Useful for
    • memory or CPU cycles used
    • number of threads or processes
    • number of tasks (e.g. in a queue)
    • number of objects (e.g. in a database)
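
A gauge, by contrast, can be set, raised, or lowered freely. The sketch below is a toy class in plain Python with a hypothetical metric name; the `track_inprogress` helper illustrates the common "number of things currently happening" pattern.

```python
from contextlib import contextmanager


class Gauge:
    """A toy gauge: a single value that may move in either direction."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0

    def set(self, value):
        self.value = float(value)

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount

    @contextmanager
    def track_inprogress(self):
        # Raise the gauge while the block runs, lower it again afterwards.
        self.inc()
        try:
            yield
        finally:
            self.dec()


queue_depth = Gauge("queue_depth")  # hypothetical metric name
queue_depth.set(3)
with queue_depth.track_inprogress():
    depth_inside = queue_depth.value
depth_after = queue_depth.value
```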

Histogram

  • Samples Observations in Configurable Buckets
  • Cumulative Across Buckets
  • Exposes Multiple Time Series
    • cumulative counters for the observation buckets
    • total sum of all observed values
    • count of events observed
  • Useful for
    • Measuring Latencies/Response Times by Quantile
    • Approximating Apdex Scores
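
The cumulative bucket layout and the Apdex approximation can be sketched in a few lines of plain Python. This is an illustrative toy, not a client library: the class, the metric name, and the bucket bounds are assumptions. The Apdex estimate uses two buckets, one at the "satisfied" threshold and one at the "tolerating" threshold (conventionally four times the first).

```python
import math


class Histogram:
    """A toy histogram with cumulative buckets, a running sum, and a count."""

    def __init__(self, name, bounds):
        self.name = name
        self.bounds = sorted(bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        self.sum += value
        self.count += 1
        # Cumulative: every bucket whose upper bound covers the value is incremented.
        for i, upper in enumerate(self.bounds):
            if value <= upper:
                self.bucket_counts[i] += 1

    def bucket(self, upper):
        return self.bucket_counts[self.bounds.index(upper)]


def apdex(hist, satisfied_le, tolerating_le):
    """Approximate Apdex as (satisfied + tolerating / 2) / total.

    Because buckets are cumulative, the tolerating bucket already includes the
    satisfied requests, so the sum of both buckets is halved.
    """
    return (hist.bucket(satisfied_le) + hist.bucket(tolerating_le)) / 2 / hist.count


latency = Histogram("request_duration_seconds", [0.1, 0.3, 1.2])
for seconds in [0.05, 0.2, 0.25, 0.9, 3.0]:
    latency.observe(seconds)

score = apdex(latency, satisfied_le=0.3, tolerating_le=1.2)
```

The result is only an approximation because the bucket bounds rarely match the exact thresholds of interest.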

Summary

  • Similar to a Histogram
  • Calculates Configurable Quantiles Over a Sliding Time Window
  • Quantiles Cannot Be Aggregated (e.g. among multiple instances)
  • Exposes Multiple Time Series
    • streaming quantiles of observed events
    • total sum of all observed values
    • count of observed events
  • Useful for
    • similar metrics as histograms
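
The sliding-window behaviour can be illustrated with a toy summary in plain Python. Note the simplification: this sketch stores every observation and computes exact nearest-rank quantiles, whereas real client libraries typically use streaming approximations with bounded memory. The class and metric name are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
import math
from collections import deque


class Summary:
    """Toy summary: quantiles over observations from the last `window` seconds."""

    def __init__(self, name, window):
        self.name = name
        self.window = window
        self.samples = deque()  # (timestamp, value) pairs, oldest first
        self.sum = 0.0          # lifetime totals, never expired
        self.count = 0

    def observe(self, value, now):
        self.samples.append((now, value))
        self.sum += value
        self.count += 1

    def quantile(self, q, now):
        # Drop observations that have slid out of the time window.
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        values = sorted(v for _, v in self.samples)
        if not values:
            return math.nan
        # Nearest-rank quantile over what is left in the window.
        rank = max(1, math.ceil(q * len(values)))
        return values[rank - 1]


rpc = Summary("rpc_duration_seconds", window=60)
for t, v in [(0, 5.0), (10, 0.1), (20, 0.2), (30, 0.3), (40, 0.4)]:
    rpc.observe(v, now=t)

median_early = rpc.quantile(0.5, now=45)  # all five observations in the window
median_late = rpc.quantile(0.5, now=65)   # the observation at t=0 has expired
```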

Histogram or Summary?

  • It's Complicated!
  • Read Docs and Seek Guidance
  • Guidelines Distilled
    • If you need to aggregate, use Histogram
    • If you have an idea of the range and distribution of values that will be observed, use Histogram
    • If you need an accurate quantile, regardless of the range and distribution of values, use Summary
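
The aggregation guideline is easier to see with numbers. The sketch below merges the cumulative buckets of two hypothetical instances and estimates a quantile by linear interpolation inside the matching bucket, a simplified version of the idea behind PromQL's `histogram_quantile`. All data values are invented for illustration.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from [(upper_bound, cumulative_count), ...].

    Buckets must be sorted by bound and end with +Inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # Cannot interpolate into +Inf; report the highest finite bound.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (cumulative - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return lower_bound


def merge(a, b):
    """Sum cumulative counts bucket by bucket; bounds must be identical."""
    return [(upper, count_a + count_b) for (upper, count_a), (_, count_b) in zip(a, b)]


instance_a = [(0.1, 95), (0.5, 100), (math.inf, 100)]  # busy and fast
instance_b = [(0.1, 0), (0.5, 20), (math.inf, 20)]     # quiet and slow

merged_p90 = estimate_quantile(0.9, merge(instance_a, instance_b))
averaged_p90 = (estimate_quantile(0.9, instance_a)
                + estimate_quantile(0.9, instance_b)) / 2
print(merged_p90, averaged_p90)
```

Averaging the two per-instance quantiles weights both instances equally even though one served five times the traffic, which is why precomputed summary quantiles cannot be combined this way.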

Service Types

Online

  • Human or System Expects an Immediate Response
  • White Box Instrumentation Helps Diagnose Where a Problem Lies
  • Key Metrics
    • number of performed queries (counter)
    • number of errors/exceptions (counter)
    • latency (histogram or summary)
  • Pro Tip: Count Queries When They *END*
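
One way to follow the tip above is to update all three key metrics in the same place, when the query finishes. The sketch below uses a plain dictionary as a stand-in for real metric objects; the names `metrics` and `handle` are hypothetical, and the list of latencies stands in for a histogram or summary.

```python
import time

metrics = {"queries_total": 0, "errors_total": 0, "latencies": []}


def handle(query, handler):
    """Run `handler(query)` and record count, errors, and latency at the end."""
    start = time.monotonic()
    try:
        return handler(query)
    except Exception:
        metrics["errors_total"] += 1
        raise
    finally:
        # Counting at the end keeps the query count aligned with errors and latency.
        metrics["queries_total"] += 1
        metrics["latencies"].append(time.monotonic() - start)


handle("ok", str.upper)
try:
    handle("boom", lambda q: 1 / 0)
except ZeroDivisionError:
    pass
```

Because every counted query also has an error outcome and a latency, ratios such as errors per query stay consistent.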

Offline

  • Continually Running, but Nothing Awaits Response
  • Key Metrics
    • Items In (counter)
    • Items in Progress (gauge)
    • Items Out (counter)
    • Items Sent (gauge)
  • Pro Tip: Use a Heartbeat to Expose Processing Time
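
The slide does not spell out the heartbeat tip, so the following reading is an assumption: a dummy item carrying its insertion timestamp is passed through the pipeline, and each stage exposes the most recent heartbeat timestamp it has seen. The sketch uses a plain dictionary for the metrics; all names are hypothetical.

```python
import time
from collections import deque

HEARTBEAT = "heartbeat"  # hypothetical marker for dummy items

stage_metrics = {
    "items_in_total": 0,
    "items_in_progress": 0,
    "items_out_total": 0,
    "last_heartbeat_timestamp_seconds": 0.0,
}


def make_heartbeat(now=None):
    """Create a dummy item stamped with the time it entered the pipeline."""
    return (HEARTBEAT, time.time() if now is None else now)


def process(item, outbox):
    """Handle one (kind, payload) item and forward it to the next stage."""
    stage_metrics["items_in_total"] += 1
    stage_metrics["items_in_progress"] += 1
    try:
        kind, payload = item
        if kind == HEARTBEAT:
            stage_metrics["last_heartbeat_timestamp_seconds"] = payload
        outbox.append(item)
        stage_metrics["items_out_total"] += 1
    finally:
        stage_metrics["items_in_progress"] -= 1


outbox = deque()
process(("record", {"id": 1}), outbox)
process(make_heartbeat(now=1536796800.0), outbox)
```

Comparing the exposed timestamp with the current time gives a rough measure of how long items take to reach this stage.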

Batch

  • Like an Offline Service, but Not Continually Running
  • Cannot Be Scraped (Must Use the Pushgateway)
  • Key Metrics
    • UNIX Timestamp of Last Successful Run (gauge)
    • UNIX Timestamp of Last Failed Run (gauge)
    • Duration of Each Processing Stage (gauge)
    • Overall Runtime (gauge)
    • Number of Records Processed (counter)
    • Number of Records Failed (counter)
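
A batch job typically renders its metrics once, at the end of the run, and pushes them. The sketch below builds such a push in plain Python without sending it. The gateway address, job name, and metric names are hypothetical; the URL shape `/metrics/job/<job>` and the use of `PUT` to replace a job's metrics follow the Pushgateway's HTTP API.

```python
import urllib.request


def render_metrics(samples):
    """Render (name, type, value) tuples in the text exposition format."""
    lines = []
    for name, metric_type, value in samples:
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def build_push_request(gateway, job, samples):
    """Build (but do not send) a request replacing all metrics for `job`."""
    return urllib.request.Request(
        url=f"{gateway}/metrics/job/{job}",
        data=render_metrics(samples).encode("utf-8"),
        method="PUT",
    )


request = build_push_request(
    "http://pushgateway.example:9091",  # hypothetical address
    "nightly_import",                   # hypothetical job name
    [
        ("nightly_import_last_success_timestamp_seconds", "gauge", 1536796800),
        ("nightly_import_duration_seconds", "gauge", 42.5),
        ("nightly_import_records_processed_total", "counter", 1000),
    ],
)
print(request.full_url)
```

Actually sending it would be a call to `urllib.request.urlopen(request)` at the end of the job, ideally only after the run has succeeded or failed definitively.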

Examples

Best Practices

Metric Names and Labels

General Instrumentation

Prometheus Instrumentation

By wryfi
