API and Metrics

Mermin provides HTTP endpoints for health checks and Prometheus metrics. This page documents how to configure these services.

API Server Configuration

The API server provides health check endpoints used by Kubernetes and monitoring systems.

Configuration

api {
  enabled = true
  listen_address = "0.0.0.0"
  port = 8080
}

Configuration Options

`enabled`

Type: Boolean Default: true

Enable or disable the API server. When disabled, health check endpoints are not available.

Example:

api {
  enabled = false  # Disable API server
}

Disabling the API server makes the Kubernetes liveness, readiness, and startup probes fail, which can cause pods to be restarted or marked as not ready. If you disable it, remove or adjust the probes in the pod spec as well.

`listen_address`

Type: String (IP address) Default: "0.0.0.0"

IP address the API server binds to.

Common Values:

"0.0.0.0": Listen on all interfaces (default, recommended for Kubernetes)
"127.0.0.1": Listen only on localhost (for local testing)
A specific IP: Listen only on the interface that owns that address

Example:

api {
  listen_address = "127.0.0.1"  # Localhost only
}

`port`

Type: Integer Default: 8080

TCP port the API server listens on.

Example:

api {
  port = 9090  # Custom port
}

Health Check Endpoints

`/livez` - Liveness Probe

Indicates whether Mermin is alive and running.

Request:

curl http://localhost:8080/livez

Response:

200 OK: Mermin is alive
503 Service Unavailable: Mermin is not responsive

Returns: Plain text `ok`, or an error message

Use Case:

Kubernetes liveness probe
Determines if pod should be restarted

Kubernetes Configuration:

livenessProbe:
  httpGet:
    path: /livez
    port: api
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3

`/readyz` - Readiness Probe

Indicates whether Mermin is ready to accept traffic.

Request:

curl http://localhost:8080/readyz

Response:

200 OK: Mermin is ready (eBPF programs loaded, informers synced)
503 Service Unavailable: Mermin is not ready

Returns: Plain text `ok`, or an error message

Use Case:

Kubernetes readiness probe
Determines if pod should receive traffic
Useful for deployment coordination

Kubernetes Configuration:

readinessProbe:
  httpGet:
    path: /readyz
    port: api
  initialDelaySeconds: 15
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 3
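
For manual checks outside of Kubernetes, the contract described above (200 when healthy, 503 otherwise, with a plain-text body) can be exercised with a small client. The following is a minimal sketch using only the Python standard library; the `probe` and `is_healthy` helper names and the timeout default are illustrative assumptions, not part of Mermin.

```python
import urllib.error
import urllib.request


def probe(base_url: str, path: str, timeout: float = 5.0) -> tuple[int, str]:
    """Return (status_code, body) for a health endpoint such as /livez.

    urllib raises HTTPError for non-2xx responses, so a 503 is caught and
    reported as a normal result. Connection failures (URLError) are left
    to propagate to the caller.
    """
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace").strip()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace").strip()


def is_healthy(status_code: int) -> bool:
    """Only 200 counts as healthy, matching the documented responses."""
    return status_code == 200
```

Assuming the API port is reachable on localhost, `probe("http://localhost:8080", "/readyz")` would return the status code and body of the readiness check.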

`/startup` - Startup Probe

Indicates whether Mermin has completed initial startup.

Request:

curl http://localhost:8080/startup

Response:

200 OK: Startup complete
503 Service Unavailable: Still starting up

Returns: Plain text `ok`, or an error message

Use Case:

Kubernetes startup probe
Delays liveness checks until initial startup is complete
Prevents premature restarts during slow startup

Kubernetes Configuration:

startupProbe:
  httpGet:
    path: /startup
    port: api
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 5
  failureThreshold: 30  # Allow up to 150s for startup
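
The 150s figure in the comment follows from how Kubernetes counts probe failures: the kubelet waits `initialDelaySeconds`, probes every `periodSeconds`, and gives up after `failureThreshold` consecutive failures. A rough sketch of that arithmetic, using a hypothetical helper and ignoring per-probe `timeoutSeconds`:

```python
def probe_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> tuple[int, int]:
    """Approximate how long a container may keep failing a probe.

    Returns (window, total): `window` is failure_threshold * period and
    `total` also counts the initial delay. Probe timeouts are ignored,
    so this is an estimate rather than an exact kubelet timeline.
    """
    window = failure_threshold * period
    return window, initial_delay + window


# Values taken from the probe examples on this page.
startup = probe_budget_seconds(initial_delay=10, period=5, failure_threshold=30)
liveness = probe_budget_seconds(initial_delay=30, period=10, failure_threshold=3)
print(startup)   # (150, 160)
print(liveness)  # (30, 60)
```

With these settings the startup probe tolerates roughly 150s of failures, or about 160s counted from container start.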

Metrics Server Configuration

The metrics server exposes Prometheus-compatible metrics for monitoring Mermin's performance and health.

Configuration

metrics {
  enabled = true
  listen_address = "0.0.0.0"
  port = 10250
}

Configuration Options

`enabled`

Type: Boolean Default: true

Enable or disable the metrics server. When disabled, the /metrics endpoint is not available.

Example:

metrics {
  enabled = false  # Disable metrics
}

`listen_address`

Type: String (IP address) Default: "0.0.0.0"

IP address the metrics server binds to.

Example:

metrics {
  listen_address = "127.0.0.1"  # Localhost only
}

`port`

Type: Integer Default: 10250

TCP port the metrics server listens on.

Example:

metrics {
  port = 9090  # Custom port
}

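Port 10250 matches the kubelet's default port, which also serves kubelet metrics, so it is familiar to Kubernetes administrators. Because the kubelet already listens on 10250 on every node, choose a different port if Mermin runs with host networking.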

Metrics Endpoint

`/metrics` - Prometheus Metrics

Exposes Prometheus-compatible metrics in text format.

Request:

curl http://localhost:10250/metrics

Response: Prometheus text format metrics

Example Metrics:

# HELP mermin_flows_total Total number of flows processed
# TYPE mermin_flows_total counter
mermin_flows_total{direction="ingress"} 12543

# HELP mermin_packets_total Total number of packets captured
# TYPE mermin_packets_total counter
mermin_packets_total{interface="eth0"} 98234

# HELP mermin_packets_dropped_total Total number of packets dropped
# TYPE mermin_packets_dropped_total counter
mermin_packets_dropped_total{reason="channel_full"} 12

# HELP mermin_flow_table_size Current number of active flows
# TYPE mermin_flow_table_size gauge
mermin_flow_table_size 456

# HELP mermin_export_errors_total Total number of export errors
# TYPE mermin_export_errors_total counter
mermin_export_errors_total{exporter="otlp"} 3

# HELP mermin_export_latency_seconds Export latency in seconds
# TYPE mermin_export_latency_seconds histogram
mermin_export_latency_seconds_bucket{le="0.01"} 1234
mermin_export_latency_seconds_bucket{le="0.05"} 2345
mermin_export_latency_seconds_bucket{le="0.1"} 3456
mermin_export_latency_seconds_bucket{le="+Inf"} 3456
mermin_export_latency_seconds_sum 456.78
mermin_export_latency_seconds_count 3456
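
To make the text format concrete, the sketch below parses sample lines like the ones above into a name, a label set, and a value. It is a deliberately minimal parser for illustration only (no escaped quotes, timestamps, or exemplars), and the function name is hypothetical; production code would more likely rely on an existing Prometheus client library.

```python
import re

# Metric name, optional {labels}, then a single numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse Prometheus text-format samples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and empty lines carry no samples
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified pattern cannot handle
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


example = """\
# HELP mermin_flows_total Total number of flows processed
# TYPE mermin_flows_total counter
mermin_flows_total{direction="ingress"} 12543
mermin_flow_table_size 456
"""
for name, labels, value in parse_samples(example):
    print(name, labels, value)
```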

Prometheus Integration

Service Monitor (Prometheus Operator)

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mermin
  labels:
    app.kubernetes.io/name: mermin
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: mermin
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics

Pod Annotations (Prometheus Scraping)

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "10250"
  prometheus.io/path: "/metrics"

Prometheus Scrape Config

scrape_configs:
  - job_name: 'mermin'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
        action: keep
        regex: mermin
      - source_labels: [__meta_kubernetes_pod_ip]
        action: replace
        target_label: __address__
        replacement: $1:10250
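
The last relabel rule rewrites the scrape address to the pod IP plus the metrics port. As an illustration of what a `replace` action does, the sketch below mimics its behavior in Python. It is a simplified model rather than Prometheus code: the function name is hypothetical, only `$1`-style references are handled, and the pod IP is an example value.

```python
import re


def relabel_replace(labels: dict[str, str], source_labels: list[str], target_label: str,
                    replacement: str, regex: str = "(.*)", separator: str = ";") -> dict[str, str]:
    """Simplified model of a Prometheus `replace` relabel action."""
    joined = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, joined)  # the regex must match the whole value
    if match is None:
        return dict(labels)  # no match: labels are left unchanged
    # Translate Prometheus-style $1 references into Python's \g<1> form.
    template = re.sub(r"\$(\d+)", r"\\g<\1>", replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


# Example pod discovered by kubernetes_sd_configs (values are made up).
pod = {"__meta_kubernetes_pod_ip": "10.0.0.5", "__address__": "10.0.0.5:8080"}
relabeled = relabel_replace(pod, ["__meta_kubernetes_pod_ip"], "__address__", "$1:10250")
print(relabeled["__address__"])  # 10.0.0.5:10250
```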

Monitoring Dashboards

Key Metrics to Monitor

Flow Processing:

rate(mermin_flows_total[5m]): Flows per second
rate(mermin_packets_total[5m]): Packets per second
mermin_flow_table_size: Active flow count

Performance:

rate(mermin_packets_dropped_total[5m]): Packet drop rate
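histogram_quantile(0.95, sum by (le) (rate(mermin_export_latency_seconds_bucket[5m]))): Export latency (95th percentile; the metric is a histogram, so query its buckets rather than the bare metric name)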
CPU and memory usage from container metrics

Errors:

rate(mermin_export_errors_total[5m]): Export failure rate
Log error count from log aggregation

Resource Usage:

container_cpu_usage_seconds_total: CPU usage
container_memory_working_set_bytes: Memory usage
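
Because `mermin_export_latency_seconds` is a histogram, export latency is usually read as a quantile estimated from its cumulative buckets (in PromQL, with `histogram_quantile`). The sketch below shows the linear-interpolation idea on the example bucket counts from this page. It is a simplified illustration with an assumed `+Inf` bucket and a hypothetical function name, not a reimplementation of Prometheus.

```python
import math


def estimate_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate a quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) sorted by upper
    bound and ending with the +Inf bucket. Observations are assumed to be
    spread evenly inside each bucket (linear interpolation).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                return lower_bound  # cannot interpolate into the +Inf bucket
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Example bucket counts; the +Inf bucket is an assumption equal to the count.
buckets = [(0.01, 1234), (0.05, 2345), (0.1, 3456), (math.inf, 3456)]
p95 = estimate_quantile(0.95, buckets)
print(f"{p95:.3f}")  # 0.092
```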

Grafana Dashboard Example

{
  "dashboard": {
    "title": "Mermin Network Flows",
    "panels": [
      {
        "title": "Flows per Second",
        "targets": [
          {
            "expr": "rate(mermin_flows_total[5m])"
          }
        ]
      },
      {
        "title": "Packet Drop Rate",
        "targets": [
          {
            "expr": "rate(mermin_packets_dropped_total[5m])"
          }
        ]
      },
      {
        "title": "Active Flows",
        "targets": [
          {
            "expr": "mermin_flow_table_size"
          }
        ]
      }
    ]
  }
}
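
Dashboards like this are often generated rather than edited by hand. Below is a minimal sketch that builds the same structure from a list of panel titles and PromQL expressions. The helper name is hypothetical, and a real Grafana dashboard needs more fields than shown here, such as panel types and grid positions.

```python
import json


def build_dashboard(title: str, panels: list[tuple[str, str]]) -> dict:
    """Build the skeleton dashboard structure from (panel title, expr) pairs."""
    return {
        "dashboard": {
            "title": title,
            "panels": [
                {"title": panel_title, "targets": [{"expr": expr}]}
                for panel_title, expr in panels
            ],
        }
    }


dashboard = build_dashboard(
    "Mermin Network Flows",
    [
        ("Flows per Second", "rate(mermin_flows_total[5m])"),
        ("Packet Drop Rate", "rate(mermin_packets_dropped_total[5m])"),
        ("Active Flows", "mermin_flow_table_size"),
    ],
)
print(json.dumps(dashboard, indent=2))
```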

Security Considerations

Network Policies

Restrict access to API and metrics endpoints:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mermin-api-access
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: mermin
  policyTypes:
    - Ingress
  ingress:
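    # Allow access to the health check port. An empty namespaceSelector matches
    # pods in every namespace; kubelet probes originate from the pod's own node,
    # and traffic from that node is always allowed by network policies.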
    - from:
        - namespaceSelector: {}
      ports:
        - protocol: TCP
          port: 8080
    # Allow metrics scraping from Prometheus
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring  # assumes the Prometheus namespace carries this label
      ports:
        - protocol: TCP
          port: 10250

Authentication

Currently, the API and metrics endpoints do not support authentication. Use network policies or service mesh policies to restrict access.

For production environments:

Use network policies to limit access
Do not expose endpoints externally
Use port-forwarding for manual access: kubectl port-forward pod/mermin-xxx 8080:8080

Complete Configuration Example

# API server for health checks
api {
  enabled = true
  listen_address = "0.0.0.0"
  port = 8080
}

# Metrics server for Prometheus
metrics {
  enabled = true
  listen_address = "0.0.0.0"
  port = 10250
}
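
Before deploying, it can help to sanity-check these values. The sketch below is a hypothetical validator, not Mermin's own validation logic: it checks that each address parses as an IP address, that each port is in the valid TCP range, and, as a conservative rule, that the two servers do not share a port. The dictionary layout simply mirrors the configuration blocks above.

```python
import ipaddress


def validate_server(name: str, listen_address: str, port: int) -> list[str]:
    """Return a list of problems found in one server block."""
    errors = []
    try:
        ipaddress.ip_address(listen_address)
    except ValueError:
        errors.append(f"{name}.listen_address is not a valid IP address: {listen_address!r}")
    if not 1 <= port <= 65535:
        errors.append(f"{name}.port must be between 1 and 65535, got {port}")
    return errors


def validate_config(config: dict) -> list[str]:
    """Validate the api and metrics blocks; an empty list means no problems."""
    errors = []
    for name in ("api", "metrics"):
        section = config[name]
        errors += validate_server(name, section["listen_address"], section["port"])
    api, metrics = config["api"], config["metrics"]
    # Conservative: flags a shared port even if the addresses differ.
    if api["enabled"] and metrics["enabled"] and api["port"] == metrics["port"]:
        errors.append("api.port and metrics.port must differ when both servers are enabled")
    return errors


config = {
    "api": {"enabled": True, "listen_address": "0.0.0.0", "port": 8080},
    "metrics": {"enabled": True, "listen_address": "0.0.0.0", "port": 10250},
}
print(validate_config(config))  # []
```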

Troubleshooting

API Endpoints Not Responding

Symptoms: Health check requests time out

Solutions:

Verify api.enabled = true
Check that the port is not blocked by a firewall or network policy
Verify pod is running: kubectl get pods
Check logs: kubectl logs <pod-name>

Metrics Not Scraped by Prometheus

Symptoms: No Mermin metrics in Prometheus

Solutions:

Verify metrics.enabled = true
Check Prometheus configuration
Verify pod annotations or ServiceMonitor
Test manual scrape: curl http://<pod-ip>:10250/metrics
Check network policies

High Metrics Cardinality

Symptoms: Too many unique metric series

Solutions:

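Limit labels in metrics, for example by dropping high-cardinality labels with metric_relabel_configs in the Prometheus scrape configuration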
Use aggregation in queries
Adjust Prometheus retention
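
To see which metrics contribute the most series, one option is to count sample lines per metric name in a single scrape. The sketch below does this over the text format. The function name is illustrative, and it assumes each sample line in a scrape corresponds to one unique series.

```python
from collections import Counter


def series_per_metric(exposition_text: str) -> Counter:
    """Count sample lines per metric name in Prometheus text-format output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts
```

Calling `series_per_metric(text).most_common(10)` on the body returned by the /metrics endpoint lists the ten metric names with the most series in that scrape.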

Next Steps

Global Options: Configure logging and performance
Flow Span Options: Tune flow generation
OTLP Exporter: Configure flow export
Troubleshooting Performance: Diagnose issues

